paul_tipper

XSLT 2.0 for PHP

Is there an easy-to-install XSLT 2.0 library available for PHP? I'm currently developing a web application on an Apache/PHP5 platform that makes extensive use of XSL and XSLT transformations. I recently came up against a major performance bottleneck with an XSLT 1.0 transform; I rewrote it in XSLT 2.0, and the execution time went from 30 seconds down to 48 ms! However, the Xalan XSLT library that's installed only supports XSLT 1.0, so I'm trying to find a simple, straightforward way of getting XSLT 2.0 support. After searching the web, the only route I could find was to install the JavaBridge Java/PHP interface and then to install the Java-based Saxon XSLT processor. However, my knowledge and experience of Java is very limited, and I was utterly defeated by all the fiddling required to get JavaBridge to work. In any event, the JavaBridge manual suggests that it is currently experimental, and the development is for a customer's production system, so I'm very reluctant to use infrastructure that's not well-established and stable.

So, what I'm really looking for is XSLT 2.0 support "out-of-the-box" for PHP - if I can't get it, I'll probably have to hand-code the transformation in my back end application, which will be a real pain.

Any suggestions/help greatly appreciated.
Gertone (Geert Bormans)

Not straightforward, no.

Saxon is about your only option.
There is a Java version and a .NET version.
Maybe you can get the .NET version working with a .NET wrapper.

What were you doing in the XSLT 1 that gives such a performance gain in XSLT 2?
Some grouping and sorting?
With the proper use of keys, you might get the XSLT 1 version to 48 ms as well.
It might be worthwhile to try to optimise your XSLT 1.

Bridges to Java and .NET are time-consuming as well and far from optimal.
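To illustrate what "proper use of keys" usually means for grouping in XSLT 1.0, here is a minimal sketch of single-level Muenchian grouping. It borrows the Row element and its colourcode and vehicleid attributes from the stylesheet posted in this thread; the Groups, Group and Member output elements are hypothetical names used only for illustration, and the `//Row` path assumes every Row in the input belongs to the set being grouped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every Row by its colour code; the processor builds this index once -->
  <xsl:key name="colours" match="Row" use="@colourcode"/>

  <xsl:template match="/">
    <!-- Groups, Group and Member are hypothetical output names -->
    <Groups>
      <!-- Keep only the Row that is the first member of its colour group -->
      <xsl:for-each select="//Row[generate-id() = generate-id(key('colours', @colourcode)[1])]">
        <Group code="{@colourcode}">
          <!-- Fetch the group's members through the key instead of re-scanning the document -->
          <xsl:for-each select="key('colours', @colourcode)">
            <Member id="{@vehicleid}"/>
          </xsl:for-each>
        </Group>
      </xsl:for-each>
    </Groups>
  </xsl:template>
</xsl:stylesheet>
```

The point of the pattern is that both the "first of each group" test and the member lookup go through `key()`, so the cost depends on index lookups rather than on repeated full-path scans with several predicates.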
paul_tipper

ASKER

Thanks for the prompt reply. Yes, the purpose of the transform is to group a list of vehicles by colour, specification and availability type into an intermediary form that can then be represented as a tree. I've attached a sample XML file, the version 1.0 and 2.0 XSLT files and the expected result (for some reason, this site does not allow .xml or .xslt extensions, so I've had to rename them all to be .txt files - rename them back as appropriate). I profiled the version 1.0 XSLT using XMLSpy 2008 Enterprise, and fully 88% of the execution time is spent on the variable assignment shown below. I've also attached the profiling output - again, rename the file to profiling.xml. The version 2.0 transform file obviously expresses what I'm trying to do much more succinctly.
<xsl:variable name="availabilityTypeNodes" select="/kmsg/appresponse[@function = 'VT_GetAvailability']/vehicledata/Row[generate-id() = generate-id(key('availtypes', @availtypedesc)[@colourcode = $currentColour][@options = $currentOptions][1])]"/>


testXML.txt
prepareLocatorTree.txt
prepareLocatorTree2.0.txt
XSLOutput.txt
profiling.txt
If you use statements like this
<xsl:for-each select="key('colours', @colourcode)[generate-id() = generate-id(key('optioncodes', @options)[1])]">
instead of the full-path ones you are using now,
you can bring the numbers down to what you have with XSLT 2.
Gertone,

I presume you meant that I should replace the snippet I posted earlier (i.e. the assignment of the 'availabilityTypeNodes' variable) with your statement; I tried this, but although it did indeed run much more quickly, I got a different (presumably incorrect) result set. It looks as if I'll have to go and wreck my head getting back around the whole Muenchian grouping/sorting thing if I'm to redraft the version 1.0 stylesheet so that it runs efficiently AND correctly (I coded the original Muenchian statements using suggestions from various other web sites, and to be honest, I only half understand what's really going on in them!).

To restate what I'm trying to do in pseudo-code:

Group all nodes by colour
For each colour
  Generate output colour-level XML nodes

  Group all nodes matching the current colour by specification
  For each specification
     Generate output spec-level XML nodes

     Group all nodes matching the current colour and specification by availability type
     For each availability type
         Generate output availability type XML nodes
     End for each

  End for each

End for each

So the statements that need optimisation are the Muenchian statements. I know I'm going to find it really hard to do this - maybe you could give me some further tips as to how I can redraft these statements?
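For comparison, the three nested grouping levels in the pseudo-code map fairly directly onto nested `xsl:for-each-group` instructions in XSLT 2.0. The following is only a sketch of that mapping, not the attached prepareLocatorTree2.0 file: it reuses the element and attribute names from the snippets in this thread, omits the exact-match and back-order logic, and needs an XSLT 2.0 processor such as Saxon.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <TreeRootNode>
      <!-- Level 1: group all rows by colour -->
      <xsl:for-each-group
          select="/kmsg/appresponse[@function = 'VT_GetAvailability']/vehicledata/Row"
          group-by="@colourcode">
        <TreeNode class="colournode" code="{current-grouping-key()}">
          <!-- Level 2: group the rows of the current colour by specification -->
          <xsl:for-each-group select="current-group()" group-by="@options">
            <TreeNode class="specnode" optioncodes="{current-grouping-key()}">
              <!-- Level 3: group the rows of the current colour and spec by availability type -->
              <xsl:for-each-group select="current-group()" group-by="@availtypedesc">
                <xsl:sort select="@availtype"/>
                <!-- current-group() holds every vehicle in this colour/spec/availability group -->
                <TreeNode class="availtypenode"
                          Title="{current-grouping-key()}"
                          vehicles="{string-join(current-group()/@vehicleid, ', ')}"/>
              </xsl:for-each-group>
            </TreeNode>
          </xsl:for-each-group>
        </TreeNode>
      </xsl:for-each-group>
    </TreeRootNode>
  </xsl:template>
</xsl:stylesheet>
```

Because each inner `xsl:for-each-group` selects from `current-group()`, it only ever looks at rows that already match the outer colour and specification. In the XSLT 1.0 version, the extra predicates on the key lookups are what emulate that narrowing.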
Yes, just replacing it with my statement would return incorrect results.
I will have a go at optimising the XSLT 1 code properly (returning correct results),
but I have to go to a meeting now.
This will likely be tomorrow or the weekend.

cheers

Geert
Thanks, Geert. Following your suggestions, I've had a go at redrafting the XSLT 1.0 file to make the Muenchian statements more efficient - the revised version of the XSLT is in the snippet below. The net result is that the execution time as reported by the XMLSpy profiler has come down to 3.9 seconds - a big improvement over the original draft, to be sure, but it's still not nearly as fast as the XSLT 2.0 version of the file, which XMLSpy typically clocks at under 100ms. I'm aware that XMLSpy is not a particularly efficient XSLT processor and that the profiling itself may adversely affect the execution time, so I tried the revised XSLT 1.0 stylesheet on the Linux production server that this will run on, and it clocked in at around 1 second, which again is not as fast as I'd hoped, but may be just about good enough for live usage.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
	<xsl:key name="colours" match="Row" use="@colourcode"/>
	<xsl:key name="optioncodes" match="Row" use="@options"/>
	<xsl:key name="availtypes" match="Row" use="@availtypedesc"/>
	<xsl:variable name="backorder" select="'0'"/>
	<xsl:variable name="imsfreestock" select="'1'"/>
	<xsl:variable name="imsfreepipeline" select="'2'"/>
	<xsl:variable name="olddealerstock" select="'3'"/>
	<xsl:variable name="newdealerstock" select="'4'"/>
	<xsl:variable name="dealerpipeline" select="'5'"/>
	<xsl:template match="/">
		<TreeRootNode>
			<xsl:for-each select="/kmsg/appresponse[@function = 'VT_GetAvailability']/vehicledata/Row[generate-id(.) = generate-id(key('colours', @colourcode)[1])]">
				<!--Colour-level tree node-->
				<TreeNode>
					<xsl:attribute name="code"><xsl:value-of select="@colourcode"/></xsl:attribute>
					<xsl:attribute name="Title"><xsl:value-of select="@colourcode"/><xsl:text> - </xsl:text><xsl:value-of select="@colourdesc"/></xsl:attribute>
					<xsl:attribute name="class">colournode</xsl:attribute>
					<xsl:variable name="currentColour" select="@colourcode"/>
					<xsl:variable name="specNodes" select="key('colours', $currentColour)[generate-id() = generate-id(key('optioncodes', @options)[@colourcode = $currentColour][1])]"/>
					<xsl:variable name="exactmatches" select="boolean($specNodes[@exactmatch = '1'])"/>
					<xsl:attribute name="exactmatches"><xsl:choose><xsl:when test="$exactmatches">true</xsl:when><xsl:otherwise>false</xsl:otherwise></xsl:choose></xsl:attribute>
					<xsl:for-each select="$specNodes">
						<!--Spec-level tree node-->
						<TreeNode>
							<xsl:attribute name="optioncodes"><xsl:value-of select="@options"/></xsl:attribute>
							<xsl:attribute name="exactmatch"><xsl:value-of select="@exactmatch"/></xsl:attribute>
							<xsl:attribute name="Title"><xsl:value-of select="@specification"/></xsl:attribute>
							<xsl:attribute name="class">specnode</xsl:attribute>
							<xsl:variable name="currentOptions" select="@options"/>
							<xsl:variable name="availabilityTypeNodes" select="key('optioncodes',$currentOptions)[generate-id() = generate-id(key('availtypes', @availtypedesc)[@colourcode = $currentColour][@options = $currentOptions][1])]"/>
							<!-- xsl:variable name="availabilityTypeNodes" select="key('colours', @colourcode)[generate-id() = generate-id(key('optioncodes', @options)[1])]"/ -->
							<xsl:attribute name="imsfreestock"><xsl:choose><xsl:when test="boolean($availabilityTypeNodes[@availtype = '1'])">true</xsl:when><xsl:otherwise>false</xsl:otherwise></xsl:choose></xsl:attribute>
							<xsl:for-each select="$availabilityTypeNodes">
								<xsl:sort select="@availtype"/>
								<!--Availability-type tree node-->
								<TreeNode>
									<xsl:attribute name="vehicles"><xsl:for-each select="key('availtypes', @availtypedesc)[@colourcode = $currentColour][@options = $currentOptions]"><xsl:choose><xsl:when test="position() = 1"><xsl:value-of select="@vehicleid"/></xsl:when><xsl:otherwise><xsl:text>, </xsl:text><xsl:value-of select="@vehicleid"/></xsl:otherwise></xsl:choose></xsl:for-each></xsl:attribute>
									<xsl:attribute name="availtype"><xsl:value-of select="@availtype"/></xsl:attribute>
									<xsl:attribute name="Title"><xsl:value-of select="@availtypedesc"/></xsl:attribute>
									<xsl:attribute name="class">availtypenode</xsl:attribute>
								</TreeNode>
							</xsl:for-each>
							<xsl:if test="@exactmatch = '1' and count($availabilityTypeNodes[@availtype = $imsfreestock or @availtype = $imsfreepipeline or @availtype = $olddealerstock]) = 0">
								<!--The current spec is an exact match, but there is no orderable stock available, so create a back-order line-->
								<xsl:call-template name="backordertreenode"/>
							</xsl:if>
						</TreeNode>
					</xsl:for-each>
					<xsl:if test="not($exactmatches)">
						<!--There were no exact match specs found within the current colour, so create a spec node for the specification required, with a single 'Back Order' availability line-->
						<TreeNode>
							<xsl:attribute name="optioncodes"><xsl:value-of select="/kmsg/appresponse[@function = 'VT_GetAvailability']/specrequired/options"/></xsl:attribute>
							<xsl:attribute name="backorder">1</xsl:attribute>
							<xsl:attribute name="Title"><xsl:value-of select="/kmsg/appresponse[@function = 'VT_GetAvailability']/specrequired/specification"/></xsl:attribute>
							<xsl:attribute name="class">specnode</xsl:attribute>
							<xsl:attribute name="imsfreestock">false</xsl:attribute>
							<!--Create a single 'Back Order' availability type line-->
							<xsl:call-template name="backordertreenode"/>
						</TreeNode>
					</xsl:if>
				</TreeNode>
			</xsl:for-each>
		</TreeRootNode>
	</xsl:template>
	<xsl:template name="backordertreenode">
		<!--Create a 'Back Order' availability type tree node-->
		<TreeNode>
			<xsl:attribute name="availtype"><xsl:value-of select="$backorder"/></xsl:attribute>
			<xsl:attribute name="Title">Back Order</xsl:attribute>
			<xsl:attribute name="class">availtypenode</xsl:attribute>
		</TreeNode>
	</xsl:template>
</xsl:stylesheet>


ASKER CERTIFIED SOLUTION
Gertone (Geert Bormans)
Geert,

I used the concatenated key approach you suggested, and that did indeed yield a huge improvement - the profiler now gives a total execution time for the transform of 381ms - still slower than the version 2.0 script, but fully 2 orders of magnitude faster than my original attempt!

Thanks a million for all your help - you are truly The XSLT Man!
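The accepted answer itself is not reproduced in this thread, so the following is only a sketch of what a concatenated-key approach to three-level Muenchian grouping commonly looks like, not Geert's actual code. The key names colourSpec and colourSpecAvail are hypothetical, the '|' separator assumes that character never occurs in the attribute values, and `//Row` assumes every Row in the input belongs under vehicledata. The exact-match and back-order logic from the stylesheet above is left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- One key per grouping level; each deeper key concatenates the values of the levels above it -->
  <xsl:key name="colours" match="Row" use="@colourcode"/>
  <xsl:key name="colourSpec" match="Row" use="concat(@colourcode, '|', @options)"/>
  <xsl:key name="colourSpecAvail" match="Row"
           use="concat(@colourcode, '|', @options, '|', @availtypedesc)"/>

  <xsl:template match="/">
    <TreeRootNode>
      <!-- First Row of each colour -->
      <xsl:for-each select="//Row[generate-id() = generate-id(key('colours', @colourcode)[1])]">
        <TreeNode class="colournode" code="{@colourcode}">
          <!-- First Row of each colour+spec combination within this colour -->
          <xsl:for-each select="key('colours', @colourcode)[generate-id() = generate-id(key('colourSpec', concat(@colourcode, '|', @options))[1])]">
            <TreeNode class="specnode" optioncodes="{@options}">
              <!-- First Row of each colour+spec+availability combination within this spec -->
              <xsl:for-each select="key('colourSpec', concat(@colourcode, '|', @options))[generate-id() = generate-id(key('colourSpecAvail', concat(@colourcode, '|', @options, '|', @availtypedesc))[1])]">
                <xsl:sort select="@availtype"/>
                <TreeNode class="availtypenode" Title="{@availtypedesc}"/>
              </xsl:for-each>
            </TreeNode>
          </xsl:for-each>
        </TreeNode>
      </xsl:for-each>
    </TreeRootNode>
  </xsl:template>
</xsl:stylesheet>
```

The difference from the earlier stylesheet is that each lookup returns exactly the rows for one colour/spec/availability combination, so the `[@colourcode = $currentColour][@options = $currentOptions]` filters that were applied to every key result are no longer needed.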