XSLT Grouping and Aggregation

Posted on 2007-03-19
Last Modified: 2012-08-13
This should be simple, but I'm not very strong in XSLT.
I am trying to write an XSLT transform over XML returned from a dataset that gives me the same
output as a SQL "GROUP BY" statement combined with "SUM".

Using the Muenchian method I have been able to group by country, but not to sum the quantities.

My XML Structure:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <Table>
    <ReportingDataId>2</ReportingDataId>
    <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
    <SystemQuantity>2</SystemQuantity>
    <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>United Kingdom</ReportingBusUnitDesc>
      <SystemQuantity>2</SystemQuantity>
      <TotalRevenue>10000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
      <SystemQuantity>4</SystemQuantity>
      <TotalRevenue>20000</TotalRevenue>
  </Table>
  <Table>
      <ReportingDataId>2</ReportingDataId>
      <ReportingBusUnitDesc>Germany</ReportingBusUnitDesc>
      <SystemQuantity>4</SystemQuantity>
      <TotalRevenue>20000</TotalRevenue>
  </Table>
</NewDataSet>

My XSL:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="1000">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
     
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(SystemQuantity[../ReportingBusUnitDesc=current()])"/>
        </td>
        <td class="columnData">
          $
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>
 
  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

My desired output is:

Country          Quantity   Revenue
Germany                 8     40000
United Kingdom          4     20000
Total                  12     60000

If anyone could set me on the right path I would be very grateful.
Question by:johnaryan

Author Comment

by:johnaryan
ID: 18749073
I got this idea from another post on EE, and it does the job.
I'm still looking to get the optimal performance out of this, so rather than give myself the points I'm leaving the post open.

My solution:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
  <xsl:template match="/">
    <table border="0" cellpadding="0" cellspacing="0" width="600">
      <tr>
        <td class="header">
          Country
        </td>
        <td class="header">
          units
        </td>
        <td class="header">
          revenue
        </td>
      </tr>
      <xsl:call-template name="dataTableGroup" />
      <!-- Sum the Totals of quantities and revenue-->
      <xsl:call-template name="dataTotals" />
    </table>
  </xsl:template>

  <xsl:template name="dataTableGroup">
    <xsl:for-each select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc))]">
      <xsl:sort select="ReportingBusUnitDesc"/>

      <tr>
        <td class="columnTextRight">
          <xsl:value-of select="ReportingBusUnitDesc"/>
        </td>
        <td class="columnData">
          <xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)"/>
        </td>
        <td class="columnData">
          $<xsl:value-of select="sum(key('countryKey',ReportingBusUnitDesc)/TotalRevenue)"/>
        </td>
      </tr>

    </xsl:for-each>
  </xsl:template>

  <xsl:template name="dataTotals">
    <tr>
      <td class="footer">
        Total
      </td>
      <td class="footer" >
        <xsl:value-of select="sum(/NewDataSet/Table/SystemQuantity)" />
      </td>
      <td class="footer">
        $<xsl:value-of select="sum(/NewDataSet/Table/TotalRevenue)" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

Accepted Solution

by:
Geert Bormans earned 2000 total points
ID: 18752473
Hi,

There is not much to add to your solution.
By using key() in the sum, you have already optimised that part: sum(key('countryKey',ReportingBusUnitDesc)/SystemQuantity)
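
To make the difference concrete, here is a minimal sketch (text output instead of the HTML table, purely for brevity) that computes each country's quantity twice: once through the key, and once with a predicate that compares every Table again. Both expressions select the same nodes. How much the keyed form saves is an assumption that depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>

  <xsl:template match="/">
    <!-- Muenchian step: keep only the first Table of each country -->
    <xsl:for-each select="NewDataSet/Table[generate-id() = generate-id(key('countryKey', ReportingBusUnitDesc)[1])]">
      <xsl:sort select="ReportingBusUnitDesc"/>
      <xsl:value-of select="ReportingBusUnitDesc"/>
      <xsl:text>: </xsl:text>
      <!-- Keyed sum: the group is fetched through the key -->
      <xsl:value-of select="sum(key('countryKey', ReportingBusUnitDesc)/SystemQuantity)"/>
      <xsl:text> / </xsl:text>
      <!-- Unkeyed sum: same nodes, but every Table is tested once per country -->
      <xsl:value-of select="sum(/NewDataSet/Table[ReportingBusUnitDesc = current()/ReportingBusUnitDesc]/SystemQuantity)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Against the four-row sample from the question (with a well-formed closing tag), each line should show the same number twice: 8 for Germany and 4 for United Kingdom. The unkeyed predicate is also what the original attempt, sum(SystemQuantity[../ReportingBusUnitDesc=current()]), was reaching for. There, SystemQuantity only selects the child of the current Table, and current() is the Table itself rather than its ReportingBusUnitDesc, so the comparison never matches.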

To me, the most expensive part left, performance-wise, is calculating the totals from the entire node-set.
You could set up a construction that walks recursively through the unique nodes,
keeping a running count of the sums, but it makes your code less readable,
and I am not sure you will gain a lot. The gain would be very processor dependent.

I don't know how big your set is or how much improvement you need.
I always try to write decent code, without optimising to the extreme, and then check first whether the load requires further optimisation.

Just in case you want to compare:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="countryKey" match="Table" use="ReportingBusUnitDesc"/>
    <xsl:template match="/">
        <table border="0" cellpadding="0" cellspacing="0" width="600">
            <tr>
                <td class="header">
                    Country
                </td>
                <td class="header">
                    units
                </td>
                <td class="header">
                    revenue
                </td>
            </tr>
            <xsl:call-template name="processUniqueTables" >
                <xsl:with-param name="tableNode" select="NewDataSet/Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                <xsl:with-param name="runningTotal">0</xsl:with-param>
                <xsl:with-param name="runningDollarTotal">0</xsl:with-param>
            </xsl:call-template>
        </table>
    </xsl:template>
   
    <xsl:template name="processUniqueTables">
        <xsl:param name="tableNode"/>
        <xsl:param name="runningTotal"/>
        <xsl:param name="runningDollarTotal"/>
       
        <xsl:param name="thisSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/SystemQuantity)"></xsl:param>
        <xsl:param name="thisDollarSum" select="sum(key('countryKey',$tableNode/ReportingBusUnitDesc)/TotalRevenue)"/>
        <tr>
            <td class="columnTextRight">
                <xsl:value-of select="$tableNode/ReportingBusUnitDesc"/>
            </td>
            <td class="columnData">
                <xsl:value-of select="$thisSum"/>
            </td>
            <td class="columnData">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$thisDollarSum"/>
            </td>
        </tr>
        <xsl:choose>
            <xsl:when test="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])]">
                <xsl:call-template name="processUniqueTables" >
                    <xsl:with-param name="tableNode" select="$tableNode/following-sibling::Table[generate-id()=generate-id(key('countryKey',ReportingBusUnitDesc)[1])][1]"/>
                    <xsl:with-param name="runningTotal" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="runningDollarTotal"  select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- Sum the Totals of quantities and revenue-->
                <xsl:call-template name="dataTotals" >
                    <xsl:with-param name="total" select="$thisSum + $runningTotal"/>
                    <xsl:with-param name="dollarTotal" select="$thisDollarSum + $runningDollarTotal"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
       
    </xsl:template>
   
    <xsl:template name="dataTotals">
        <xsl:param name="dollarTotal"/>
        <xsl:param name="total"/>
        <tr>
            <td class="footer">
                Total
            </td>
            <td class="footer" >
                <xsl:value-of select="$total" />
            </td>
            <td class="footer">
                <xsl:text>$</xsl:text>
                <xsl:value-of select="$dollarTotal" />
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

With this solution you lose the sorting,
because you iterate over the unique tables in document order.
You also lose clarity: it takes an experienced XSLT programmer to see instantly what is happening.
But you also get rid of the access to all Table elements starting from the root in the sums.
It will depend on your processor implementation how much faster this is,
but it will be at least a little (I measured a difference of 10-15% on Saxon with a small dataset).
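
If sorting and readability matter more than the last few percent, and an XSLT 2.0 processor is available (an assumption: every stylesheet in this thread is version 1.0), the same grouping can be sketched with xsl:for-each-group. This swaps in standard XSLT 2.0 grouping in place of the Muenchian method, and it uses text output for brevity rather than the HTML table above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Requires an XSLT 2.0 (or later) processor; will not run under XSLT 1.0 -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One iteration per distinct ReportingBusUnitDesc value -->
    <xsl:for-each-group select="NewDataSet/Table" group-by="ReportingBusUnitDesc">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <!-- current-group() holds every Table in this country's group -->
      <xsl:value-of select="sum(current-group()/SystemQuantity)"/>
      <xsl:text> units, $</xsl:text>
      <xsl:value-of select="sum(current-group()/TotalRevenue)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
    <!-- Grand totals still read every Table once from the root -->
    <xsl:text>Total: </xsl:text>
    <xsl:value-of select="sum(NewDataSet/Table/SystemQuantity)"/>
    <xsl:text> units, $</xsl:text>
    <xsl:value-of select="sum(NewDataSet/Table/TotalRevenue)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the sort and avoids both the key declaration and the recursion. Whether it is faster than either 1.0 version is not something this thread measured.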

cheers

Geert

Author Comment

by:johnaryan
ID: 18758066
Thanks Geert, I do appreciate the input.
I don't need much more optimization, as it seems the dataset will only be on the order of 10000 rows. At the moment my sample set of 4000 performs adequately.

Expert Comment

by:Geert Bormans
ID: 18758508
cheers