
Muenchian method - Unique Values

I am aware that there is a technique called the Muenchian method for selecting unique values from an unsorted source document, e.g. selecting the unique InvoiceID values from:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="G:\My Documents\Xml Projects\People\UnSortedInvoiceLines.xslt"?>
<InvoiceLineItems>
      <LineItem InvoiceID="0000002">
            <ItemID>DDE111</ItemID>
            <Quantity>2</Quantity>
      </LineItem>
      <LineItem InvoiceID="0000001">
            <ItemID>BCC002</ItemID>
            <Quantity>12</Quantity>
      </LineItem>
      <LineItem InvoiceID="0000003">
            <ItemID>BBB002</ItemID>
            <Quantity>5</Quantity>
      </LineItem>      
      <LineItem InvoiceID="0000002">
            <ItemID>CCD344</ItemID>
            <Quantity>1</Quantity>
      </LineItem>
      <LineItem InvoiceID="0000001">
            <ItemID>AAA001</ItemID>
            <Quantity>23</Quantity>
      </LineItem>      
      <LineItem InvoiceID="0000002">
            <ItemID>AAA003</ItemID>
            <Quantity>4</Quantity>
      </LineItem>
</InvoiceLineItems>


Can someone show me how to do this and explain how it actually works?

Thanks

Asked by: daveamour
 
Geert Bormans Commented:
Hi daveamour,

The Muenchian method is more about sorting and grouping.
Here is a good explanation:
http://www.jenitennison.com/xslt/grouping/muenchian.xml

Muench uses keys and generate-id() to get the unique values,
but there is a more lightweight approach as well.

I will show you one that is easy to understand with your example,
and show the Muenchian version in the next post.

This example checks, for every LineItem in the for-each, whether an earlier LineItem had the same @InvoiceID:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="InvoiceLineItems">
        <xsl:for-each select="LineItem[not(@InvoiceID = preceding-sibling::LineItem/@InvoiceID)]">
            <xsl:copy-of select="."/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Cheers!
 
Geert Bormans Commented:
daveamour,

and here is the key version, as used by Muench

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:key name="LineItem-by-ID" match="LineItem" use="@InvoiceID" />
    <xsl:template match="InvoiceLineItems">
        <xsl:for-each select="LineItem[generate-id() = generate-id(key('LineItem-by-ID', @InvoiceID)[1])]">
            <xsl:copy-of select="."/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

 
Geert Bormans Commented:
daveamour,

first you create a key
<xsl:key name="LineItem-by-ID" match="LineItem" use="@InvoiceID" />

this key lets you look up LineItem elements (the "match" attribute) by their @InvoiceID (the "use" attribute)
(it is a bit like an index in a database)

now you generate an id for the current LineItem node (the generated id of a node is unpredictable, but within one transformation it is the same every time you calculate it)
and you generate the id of the first LineItem node having the same @InvoiceID (using the key)
then you can compare the generated ids... if they are the same, you know it is the same node, so the current LineItem is the first one with this particular @InvoiceID

that is what happens here
select="LineItem[generate-id() = generate-id(key('LineItem-by-ID', @InvoiceID)[1])]"
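
Since the method is really about grouping, here is a minimal sketch of how the same key could be reused to collect the items of each invoice once the first LineItem per @InvoiceID has been found. The output element names Invoices and Invoice are made up for illustration, and the sort is an optional extra that is not in the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes"/>
    <!-- same key as above: index LineItem elements by their @InvoiceID -->
    <xsl:key name="LineItem-by-ID" match="LineItem" use="@InvoiceID"/>
    <xsl:template match="InvoiceLineItems">
        <!-- Invoices / Invoice are hypothetical output names -->
        <Invoices>
            <!-- keep only the first LineItem of each distinct @InvoiceID -->
            <xsl:for-each select="LineItem[generate-id() = generate-id(key('LineItem-by-ID', @InvoiceID)[1])]">
                <!-- optional: order the groups by their id -->
                <xsl:sort select="@InvoiceID"/>
                <Invoice ID="{@InvoiceID}">
                    <!-- the key returns every LineItem sharing this id, in document order -->
                    <xsl:copy-of select="key('LineItem-by-ID', @InvoiceID)/ItemID"/>
                </Invoice>
            </xsl:for-each>
        </Invoices>
    </xsl:template>
</xsl:stylesheet>
```

With the sample document from the question this should give three Invoice elements: 0000001 holding BCC002 and AAA001, 0000002 holding DDE111, CCD344 and AAA003, and 0000003 holding BBB002.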

hope this helps

cheers
 
daveamour (Author) Commented:
Ok thanks Gertone.  I have a few questions if that's ok?  With your first example, I had tried something similar, but with sorting and checking against the preceding-sibling, and it didn't work.

Anyway I got the impression that preceding-sibling::LineItem/@InvoiceID in this case returns all preceding siblings.  In this case then the = operator is not behaving as one would normally expect.  Rather than a 1 to 1 equality test it seems to be working as a 1 to many equality check.  Does this make sense and is this correct?

Thanks

Dave
 
daveamour (Author) Commented:
Another thing:

You use:

<xsl:for-each select="LineItem[not(@InvoiceID = preceding-sibling::LineItem/@InvoiceID)]">

Earlier I tried:

<xsl:for-each select="LineItem[@InvoiceID != preceding-sibling::LineItem/@InvoiceID]">

which doesn't seem to mean the same thing.  How does this work?

Cheers

Dave
 
Geert Bormans Commented:
> preceding-sibling::LineItem/@InvoiceID in this case returns all preceding siblings
correct, it returns a node-set: the @InvoiceID attributes of all the preceding siblings named LineItem

> it seems to be working as a 1 to many equality check
Yes. When one side of = is a node-set, the comparison is true as soon as at least one node in that set has a matching value... so it is a 1 to many equality check

> select="LineItem[not(@InvoiceID = preceding-sibling::LineItem/@InvoiceID)]">
selects the LineItem elements whose @InvoiceID is not equal to any of the preceding siblings' @InvoiceID attributes

> select="LineItem[@InvoiceID != preceding-sibling::LineItem/@InvoiceID]">
selects the LineItem elements whose @InvoiceID differs from at least one of the preceding siblings' @InvoiceID attributes (so the very first LineItem, which has no preceding siblings, is never selected)
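
To see the difference on the sample document, a small test stylesheet along these lines could print the @InvoiceID values that each predicate selects. This is only a sketch: the text output and the labels are made up for the comparison, while the two select expressions are the ones quoted above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text"/>
    <xsl:template match="InvoiceLineItems">
        <!-- not(a = b): true only when NO preceding @InvoiceID matches -->
        <xsl:text>not(=):</xsl:text>
        <xsl:for-each select="LineItem[not(@InvoiceID = preceding-sibling::LineItem/@InvoiceID)]">
            <xsl:value-of select="concat(' ', @InvoiceID)"/>
        </xsl:for-each>
        <!-- a != b: true when AT LEAST ONE preceding @InvoiceID differs -->
        <xsl:text>&#10;!=:</xsl:text>
        <xsl:for-each select="LineItem[@InvoiceID != preceding-sibling::LineItem/@InvoiceID]">
            <xsl:value-of select="concat(' ', @InvoiceID)"/>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

Against the document in the question, the first line should list 0000002 0000001 0000003 (one per distinct id), while the second should list 0000001 0000003 0000002 0000001 0000002, that is every LineItem except the first one, because each later item has at least one preceding sibling with a different id.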
 
daveamour (Author) Commented:
Ok thanks, this is a lot to take in!

Thanks very much - I shall go away and study this.

Cheers

Dave


 
Geert Bormans Commented:
welcome