Solved

XSLT Grouping and Aggregation

Posted on 2007-03-19
Last Modified: 2012-08-13
This should be simple, but I'm not very strong in XSLT.
I am trying to write an XSLT transform for XML returned from a dataset that gives me the same
output as a SQL "GROUP BY" statement combined with "SUM".

Using the Muenchian method I have been able to group by country, but not to sum the quantities.

My XML Structure:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
    <SystemQuantity>4</SystemQuantity>
    <TotalRevenue>20000</TotalRevenue>
  </Table>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
    <SystemQuantity>4</SystemQuantity>
    <TotalRevenue>20000</TotalRevenue>
  </Table>
</NewDataSet>

My XSL:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="1000">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
     
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(SystemQuantity[../ReportingBusUnitDesc=current()])"/>
        </td>
        <td class="columnData">
          $
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>
 
  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

My desired output is:

Country          Quantity   Revenue
Germany                 8     40000
United Kingdom          4     20000
Total                  12     60000

If anyone could set me on the right path I would be very grateful.
Question by:johnaryan

Author Comment

by:johnaryan
I got this idea from another post on EE, and it does the job.
I'm still looking to get optimal performance out of this, so rather than give myself the points I'm leaving the post open.

My solution:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="600">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)"/>
        </td>
        <td class="columnData">
          $<xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/TotalRevenue)"/>
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>

  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

Accepted Solution

by:Geert Bormans (earned 500 total points)
Hi,

There is not much to add to your solution.
By using key() in the sum, you have already optimised that part: sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)
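
To make the difference concrete, here is a minimal sketch (not taken from the posts in this thread) that prints each country's quantity twice: once with a predicate that compares every Table against the current country, and once with the key lookup. The text output and the "scan=" / "key=" labels are only there for illustration. Note that the expression in the question, sum(SystemQuantity[../ReportingBusUnitDesc=current()]), starts from the SystemQuantity child of the current Table only, so it can never reach the other rows of the group, whatever the comparison inside the predicate says.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Same key as in the thread: Table elements indexed by country -->
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <!-- Muenchian grouping: visit only the first Table of each country -->
    <xsl:for-each select="NewDataSet/Table[generate-id() = generate-id(key('countryKey', ReportingBusUnitDesc)[1])]">
      <xsl:sort select="ReportingBusUnitDesc"/>
      <xsl:value-of select="ReportingBusUnitDesc"/>

      <!-- Predicate scan: tests every Table in the document for each group -->
      <xsl:text>: scan=</xsl:text>
      <xsl:value-of select="sum(/NewDataSet/Table[ReportingBusUnitDesc = current()/ReportingBusUnitDesc]/SystemQuantity)"/>

      <!-- Key lookup: selects the same Table elements through the key -->
      <xsl:text> key=</xsl:text>
      <xsl:value-of select="sum(key('countryKey', ReportingBusUnitDesc)/SystemQuantity)"/>

      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Against a well-formed copy of the four-row sample, both expressions should give 8 for Germany and 4 for United Kingdom. Whether the key version is measurably faster depends on the processor, although processors typically build an index for xsl:key, so the lookup avoids rescanning all rows for every group.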

To me, the most expensive part left, performance-wise, is calculating the totals over the entire node-set.
You could set up a construction that makes a recursive walk through the unique nodes,
keeping a running count of the sums... but it makes your code less readable,
and I am not sure you will win a lot... the gain would be very processor dependent.

I don't know how big your set is and how much improvement you need.
I always try to write decent code, without optimising to the extreme, and then first check whether the load requires further optimisation.

Just in case you want to compare:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
    <xsl:template match="/">
        <table border="0" cellpadding="0" cellspacing="0" width="600">
            <tr>
                <td class="header">
                    Country
                </td>
                <td class="header">
                    units
                </td>
                <td class="header">
                    revenue
                </td>
            </tr>
            <xsl:call-template name="processUniqueTables" >
                <xsl:with-param name="tableNode" select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                <xsl:with-param name="runningTotal">0</xsl:with-param>
                <xsl:with-param name="runningDollarTotal">0</xsl:with-param>
            </xsl:call-template>
        </table>
    </xsl:template>
   
    <xsl:template name="processUniqueTables">
        <xsl:param name="tableNode"/>
        <xsl:param name="runningTotal"/>
        <xsl:param name="runningDollarTotal"/>
       
        <!-- Per-group sums are local values, not arguments, so declare them as variables -->
        <xsl:variable name="thisSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/SystemQuantity)"/>
        <xsl:variable name="thisDollarSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/TotalRevenue)"/>
        <tr>
            <td class="columnTextRight">
                <xsl:value-of select="$tableNode/ReportingBusUnitDesc"/>
            </td>
            <td class="columnData">
                <xsl:value-of select="$thisSum"/>
            </td>
            <td class="columnData">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$thisDollarSum"/>
            </td>
        </tr>
        <xsl:choose>
            <xsl:when test="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])]">
                <xsl:call-template name="processUniqueTables" >
                    <xsl:with-param name="tableNode" select="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                    <xsl:with-param name="runningTotal" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="runningDollarTotal"  select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- Sum the Totals of quantities and revenue-->
                <xsl:call-template name="dataTotals" >
                    <xsl:with-param name="total" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="dollarTotal" select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
       
    </xsl:template>
   
    <xsl:template name="dataTotals">
        <xsl:param name="dollarTotal"/>
        <xsl:param name="total"/>
        <tr>
            <td class="footer">
                Total
            </td>
            <td class="footer" >
                <xsl:value-of select="$total" />
            </td>
            <td class="footer">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$dollarTotal" />
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

With this solution, you lose the sorting,
because you iterate over the unique tables in document order.
You also lose clarity... it takes an experienced XSLT programmer to see instantly what is happening.
On the other hand, you no longer need to access all Tables starting from the root to compute the totals.
How much faster this is will depend on your processor implementation,
but it should be at least a little faster (I measured a difference of 10-15% on Saxon with a small dataset).

cheers

Geert

Author Comment

by:johnaryan
Thanks Geert, I do appreciate the input.
I don't need much optimization, as it seems the dataset will only be on the order of 10,000 rows, and at the moment my sample set of 4,000 performs adequately.

Expert Comment

by:Geert Bormans
cheers