Solved

XSLT Grouping and Aggregation

Posted on 2007-03-19
Last Modified: 2012-08-13
This should be simple, but I'm not very strong in XSLT.
I am trying to write an XSLT transform for XML returned from a dataset that gives me the same
output as a SQL "GROUP BY" statement combined with "SUM".

Using the Muenchian method I have been able to group by country, but not to sum the quantities.

My XML Structure:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
      <SystemQuantity>2</SystemQuantity>
      <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
      <SystemQuantity>4</SystemQuantity>
      <TotalRevenue>20000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
      <SystemQuantity>4</SystemQuantity>
      <TotalRevenue>20000</TotalRevenue>
  </Table>
</NewDataSet>

My XSL:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="1000">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
     
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(SystemQuantity[../ReportingBusUnitDesc=current()])"/>
        </td>
        <td class="columnData">
          $
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>
 
  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

My desired output is:

Country           Quantity    Revenue
Germany           8           40000
United Kingdom    4           20000
Total             12          60000

If anyone could set me on the right path I would be very grateful.
Question by:johnaryan

Author Comment

by:johnaryan
ID: 18749073
I got this idea from another post on EE, and it does the job.
I'm still looking to get optimal performance out of this, so rather than give myself the points I'm leaving the post open.

My solution:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="600">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)"/>
        </td>
        <td class="columnData">
          $<xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/TotalRevenue)"/>
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>

  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

Accepted Solution

by:
Geert Bormans earned 500 total points
ID: 18752473
Hi,

There is not much to add to your solution.
By using key() in the sum, you should already have optimised that part: sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)
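
To make the difference concrete, here is a minimal sketch that prints the per-country quantity twice, once with a plain predicate and once with key(). The key and element names are the ones used in this thread; the text output and the "country: a / b" layout are only for illustration. The predicate form filters all Table elements again for every group, while the key() form lets the processor use whatever index it builds for xsl:key. For comparison, the expression in the original question summed only the SystemQuantity child of the current Table, and its filter compared the country against current(), which at that point is the whole Table element rather than its ReportingBusUnitDesc, so nothing matched and the sum came out as 0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <!-- Muenchian grouping: visit only the first Table of each country -->
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])]">
      <xsl:sort select="ReportingBusUnitDesc"/>
      <xsl:value-of select="ReportingBusUnitDesc"/>
      <xsl:text>: </xsl:text>
      <!-- Predicate form: filters every Table in the document for each group -->
      <xsl:value-of select="sum(/NewDataSet/Table[ReportingBusUnitDesc=current()/ReportingBusUnitDesc]/SystemQuantity)"/>
      <xsl:text> / </xsl:text>
      <!-- Key form: fetches the Tables of this country through the key -->
      <xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Both values on each line should agree (8 for Germany and 4 for United Kingdom with the sample data above); only the cost of computing them differs.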

To me, the most expensive part left, performance-wise, is calculating the totals from the entire node-set.
You could set up a construction that walks recursively through the unique nodes,
keeping a running count of the sums, but it makes your code less readable,
and I am not sure you will gain a lot; the gain would be very processor dependent.

I don't know how big your set is or how much improvement you need.
I always try to write decent code without optimising to the extreme, and then first check whether the load requires further optimisation.

Just in case you want to compare

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
    <xsl:template match="/">
        <table border="0" cellpadding="0" cellspacing="0" width="600">
            <tr>
                <td class="header">
                    Country
                </td>
                <td class="header">
                    units
                </td>
                <td class="header">
                    revenue
                </td>
            </tr>
            <xsl:call-template name="processUniqueTables" >
                <xsl:with-param name="tableNode" select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                <xsl:with-param name="runningTotal">0</xsl:with-param>
                <xsl:with-param name="runningDollarTotal">0</xsl:with-param>
            </xsl:call-template>
        </table>
    </xsl:template>
   
    <xsl:template name="processUniqueTables">
        <xsl:param name="tableNode"/>
        <xsl:param name="runningTotal"/>
        <xsl:param name="runningDollarTotal"/>
       
        <!-- Computed locally, never passed in by the caller, so variables rather than params -->
        <xsl:variable name="thisSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/SystemQuantity)"/>
        <xsl:variable name="thisDollarSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/TotalRevenue)"/>
        <tr>
            <td class="columnTextRight">
                <xsl:value-of select="$tableNode/ReportingBusUnitDesc"/>
            </td>
            <td class="columnData">
                <xsl:value-of select="$thisSum"/>
            </td>
            <td class="columnData">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$thisDollarSum"/>
            </td>
        </tr>
        <xsl:choose>
            <xsl:when test="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])]">
                <xsl:call-template name="processUniqueTables" >
                    <xsl:with-param name="tableNode" select="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                    <xsl:with-param name="runningTotal" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="runningDollarTotal"  select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- Sum the Totals of quantities and revenue-->
                <xsl:call-template name="dataTotals" >
                    <xsl:with-param name="total" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="dollarTotal" select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
       
    </xsl:template>
   
    <xsl:template name="dataTotals">
        <xsl:param name="dollarTotal"/>
        <xsl:param name="total"/>
        <tr>
            <td class="footer">
                Total
            </td>
            <td class="footer" >
                <xsl:value-of select="$total" />
            </td>
            <td class="footer">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$dollarTotal" />
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

With this solution you lose the sorting,
because you iterate over the unique tables in document order.
You also lose clarity: it takes an experienced XSLT programmer to see instantly what is happening.
On the other hand, the totals no longer require a second pass over all Tables from the root.
Whether this is faster will depend on your processor implementation,
but I would expect it to be at least a little faster (I measured a difference of 10-15% on Saxon with a small dataset).
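
For comparison only, and assuming an XSLT 2.0 processor is available (every stylesheet in this thread is version 1.0, so this is a different technique, not a drop-in change), grouping and sorting can both be left to xsl:for-each-group. The sketch below reuses the element names from the question and leaves out the class attributes and header row for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/">
    <table>
      <!-- One iteration per distinct country, sorted by the grouping key -->
      <xsl:for-each-group select="NewDataSet/Table" group-by="ReportingBusUnitDesc">
        <xsl:sort select="current-grouping-key()"/>
        <tr>
          <td><xsl:value-of select="current-grouping-key()"/></td>
          <!-- current-group() holds every Table of this country -->
          <td><xsl:value-of select="sum(current-group()/SystemQuantity)"/></td>
          <td>$<xsl:value-of select="sum(current-group()/TotalRevenue)"/></td>
        </tr>
      </xsl:for-each-group>
      <!-- Grand totals over all Tables -->
      <tr>
        <td>Total</td>
        <td><xsl:value-of select="sum(NewDataSet/Table/SystemQuantity)"/></td>
        <td>$<xsl:value-of select="sum(NewDataSet/Table/TotalRevenue)"/></td>
      </tr>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Under XPath 2.0 rules, sums of untyped values are xs:double, so totals of a million or more may be written in exponent notation unless they are wrapped in format-number().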

cheers

Geert

Author Comment

by:johnaryan
ID: 18758066
Thanks Geert, I do appreciate the input.
I don't need much optimization, as it seems the dataset will only be in the order of 10000 rows, and at the moment my sample set of 4000 performs adequately.

Expert Comment

by:Geert Bormans
ID: 18758508
cheers