Solved

XSLT Grouping and Aggregation

Posted on 2007-03-19
Last Modified: 2012-08-13
This should be simple, but I'm not very strong in XSLT.
I am trying to write an XSLT transform for XML returned from a dataset that gives me the same
output as a SQL "Group By" combined with "Sum".

Using the Muenchian method I have been able to group by country, but not to sum the quantities.

My XML Structure:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
    <SystemQuantity>4</SystemQuantity>
    <TotalRevenue>20000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
    <SystemQuantity>4</SystemQuantity>
    <TotalRevenue>20000</TotalRevenue>
  </Table>
</NewDataSet>

My XSL:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="1000">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
     
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(SystemQuantity[../ReportingBusUnitDesc=current()])"/>
        </td>
        <td class="columnData">
          $
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>
 
  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

My desired output is:

Country           Quantity    Revenue
Germany           8           40000
United Kingdom    4           20000
Total             12          60000

If anyone could set me on the right path I would be very grateful.
Question by:johnaryan
Author Comment

by:johnaryan
ID: 18749073
I got this idea from another post on EE, and it does the job.
I'm still looking to get the best performance out of this, so rather than give myself the points I'm leaving the question open.

My solution:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="600">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)"/>
        </td>
        <td class="columnData">
          $<xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/TotalRevenue)"/>
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>

  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
Accepted Solution

by:Geert Bormans
ID: 18752473
Hi,

There is not much to add to your solution.
By using key() in the sum, you have already optimised that part: sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)
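
To make the difference concrete, here is a small sketch of the per-country cell written two ways. It is a fragment, not a full stylesheet, and it assumes it sits inside the Muenchian xsl:for-each, where the context node is the first Table of each country. The attempt in the question, sum(SystemQuantity[../ReportingBusUnitDesc=current()]), only looks at the current Table's own SystemQuantity and compares the country against the string value of the whole Table element, so the predicate never matches and the sum comes out as 0.

```xml
<!-- Fragment: belongs inside the xsl:for-each over the first Table of each country -->

<!-- Without the key: rescans every Table row once per country -->
<xsl:value-of select="sum(../Table[ReportingBusUnitDesc = current()/ReportingBusUnitDesc]/SystemQuantity)"/>

<!-- With the key: looks the group's rows up through countryKey -->
<xsl:value-of select="sum(key('countryKey', ReportingBusUnitDesc)/SystemQuantity)"/>
```

Both forms should produce the same per-country total; the keyed form simply avoids the repeated scan over all rows.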

To me, the most expensive part left, performance-wise, is calculating the totals from the entire node-set.
You could set up a construction that makes a recursive walk through the unique nodes,
keeping a running count of the sums... but it makes your code less readable,
and I am not sure you will win a lot... the gain would be very processor dependent.

I don't know how big your set is and how much improvement you need.
I always try to write decent code, without optimising to the extreme, and then check first whether the load requires further optimisation.

Just in case you want to compare:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
    <xsl:template match="/">
        <table border="0" cellpadding="0" cellspacing="0" width="600">
            <tr>
                <td class="header">
                    Country
                </td>
                <td class="header">
                    units
                </td>
                <td class="header">
                    revenue
                </td>
            </tr>
            <xsl:call-template name="processUniqueTables" >
                <xsl:with-param name="tableNode" select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                <xsl:with-param name="runningTotal">0</xsl:with-param>
                <xsl:with-param name="runningDollarTotal">0</xsl:with-param>
            </xsl:call-template>
        </table>
    </xsl:template>
   
    <xsl:template name="processUniqueTables">
        <xsl:param name="tableNode"/>
        <xsl:param name="runningTotal"/>
        <xsl:param name="runningDollarTotal"/>
       
        <!-- Per-country sums are local values, so xsl:variable rather than xsl:param -->
        <xsl:variable name="thisSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/SystemQuantity)"/>
        <xsl:variable name="thisDollarSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/TotalRevenue)"/>
        <tr>
            <td class="columnTextRight">
                <xsl:value-of select="$tableNode/ReportingBusUnitDesc"/>
            </td>
            <td class="columnData">
                <xsl:value-of select="$thisSum"/>
            </td>
            <td class="columnData">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$thisDollarSum"/>
            </td>
        </tr>
        <xsl:choose>
            <xsl:when test="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])]">
                <xsl:call-template name="processUniqueTables" >
                    <xsl:with-param name="tableNode" select="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                    <xsl:with-param name="runningTotal" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="runningDollarTotal"  select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- Sum the Totals of quantities and revenue-->
                <xsl:call-template name="dataTotals" >
                    <xsl:with-param name="total" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="dollarTotal" select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
       
    </xsl:template>
   
    <xsl:template name="dataTotals">
        <xsl:param name="dollarTotal"/>
        <xsl:param name="total"/>
        <tr>
            <td class="footer">
                Total
            </td>
            <td class="footer" >
                <xsl:value-of select="$total" />
            </td>
            <td class="footer">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$dollarTotal" />
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

With this solution you lose the sorting,
because you iterate over the unique tables in document order.
You also lose clarity... it takes an experienced XSLT programmer to see instantly what is happening.
But you also no longer need to access all Tables from the root for the sum.
Whether this is faster will depend on your processor implementation
... though it should be at least a little (I measured a difference of 10-15% on Saxon with a small dataset).
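
If an XSLT 2.0 processor is an option (an assumption; the stylesheets in this thread are all XSLT 1.0), the same grouping, sorting and summing can be sketched with xsl:for-each-group instead of a key. This is only a minimal illustration against the sample XML from the question, with the class attributes and header row left out for brevity:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <table>
      <!-- One iteration per distinct country, sorted by country name -->
      <xsl:for-each-group select="NewDataSet/Table" group-by="ReportingBusUnitDesc">
        <xsl:sort select="current-grouping-key()"/>
        <tr>
          <td><xsl:value-of select="current-grouping-key()"/></td>
          <!-- current-group() holds all Table rows of this country -->
          <td><xsl:value-of select="sum(current-group()/SystemQuantity)"/></td>
          <td>$<xsl:value-of select="sum(current-group()/TotalRevenue)"/></td>
        </tr>
      </xsl:for-each-group>
      <!-- Grand totals over every row -->
      <tr>
        <td>Total</td>
        <td><xsl:value-of select="sum(NewDataSet/Table/SystemQuantity)"/></td>
        <td>$<xsl:value-of select="sum(NewDataSet/Table/TotalRevenue)"/></td>
      </tr>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the sorting and stays readable, but whether it is faster than the keyed XSLT 1.0 version would have to be measured on your own processor and data.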

cheers

Geert
Author Comment

by:johnaryan
ID: 18758066
Thanks Geert, I do appreciate the input.
I don't need much optimization, as it seems the dataset will only be on the order of 10000 rows, and at the moment my sample set of 4000 performs adequately.
Expert Comment

by:Geert Bormans
ID: 18758508
cheers