qz8dsw
Moving around in XML using XSL
Hi all,
I am attempting to transform an XML file to CSV using XSL.
As you can see in the attached XML, there is a file node that we have to go through. (The sample XML contains only one file node, but there can be more.)
From there I need the file key and file_name.
Then, for each file node, I need to go down and do a for-each over the document nodes, grabbing the key, duplex and envelope_number values, as well as start_page and page_count, plus the values of all the nodes inside the print node.
I am hoping for output like the following.
file,file_name,document_key,duplex,envelope_number,start_page,page_count,perf_sheet,add_name,add_1,add_2,add_3,add_4,add_5,add_6
1,Statements-20140225131628-1.pdf,1,false,1,1,1,1,Test User 1,4 abc Drive,somewhere,Someplace,SomeHow,,
1,Statements-20140225131628-1.pdf,1b,true,1,2,3,1,Test User 1,4 abc Drive,somewhere,Someplace,SomeHow,,
1,Statements-20140225131628-1.pdf,2,false,2,5,1,1,Test User 2,12 abc Drive,somewhere else,Someplace else,Somewho,,
1,Statements-20140225131628-1.pdf,2b,true,2,6,2,1,Test User 2,12 abc Drive,somewhere else,Someplace else,Somewho,,
I have got part way there, but now I'm having trouble going into the document nodes of each file node.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="/root/files/file"/>
</xsl:template>
<xsl:template match="/root/files/file">
<xsl:value-of select="'file_number,file_name,document_number,duplex,envelope_number,start_page,page_count'"/>
<xsl:text>
</xsl:text>
<xsl:for-each select="/root/files/file/documents/document">
<xsl:for-each select="/root/files/file">
<xsl:value-of select="@key"/>
<xsl:value-of select="','"/>
<xsl:value-of select="file_name"/>
<xsl:value-of select="','"/>
<!-- Here is the problem: I'm currently in the file node -->
<xsl:value-of select="@key"/>
<xsl:value-of select="','"/>
<xsl:value-of select="@duplex"/>
<xsl:value-of select="','"/>
<xsl:value-of select="@envelope_number"/>
<xsl:value-of select="','"/>
<xsl:value-of select="start_page"/>
<xsl:value-of select="','"/>
<xsl:value-of select="page_count"/>
<xsl:value-of select="','"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Any help you could provide would be appreciated.
Statements-20140225131628-index.xml
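One possible way to reach the document nodes from inside each file node is to use paths relative to the current node instead of absolute paths starting at /root. What follows is only a sketch, assuming the structure implied by the stylesheet above (root/files/file/documents/document, with key, duplex and envelope_number as attributes). The variable name `file` is made up for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Outer loop: the context node becomes each file element in turn -->
    <xsl:for-each select="root/files/file">
      <!-- Remember the current file (illustrative variable name) -->
      <xsl:variable name="file" select="."/>
      <!-- Inner loop: relative path, so only this file's documents are visited -->
      <xsl:for-each select="documents/document">
        <xsl:value-of select="$file/@key"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="$file/file_name"/>
        <xsl:text>,</xsl:text>
        <!-- Inside the inner loop, @key is the document's own key -->
        <xsl:value-of select="@key"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="@duplex"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="@envelope_number"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="start_page"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="page_count"/>
        <!-- End of one CSV row -->
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

An absolute select such as /root/files/file/documents/document always returns every document in the whole input, whatever the current node is, which is why nesting two absolute for-each loops does not pair each document with its own file.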
ASKER
Thanks VERY much Geert, that has helped immensely.
I have new code based on yours, with some other parts of the XML I need added in. However, I have noticed that when I add a second file node with different values, it is extracted fine, but the title line with the column names is repeated.
Here is my new code.
(Would you suggest going to the bottom level (print) and then using ancestor:: to get back to document?)
I realise why the XSL is doing what it is doing: it has found a new file node. However, I have been thrown in at the deep end, so any advice on best practices would be appreciated, as well as on the title repeating for subsequent file nodes if they exist.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="/root/files/file"/>
</xsl:template>
<xsl:template match="/root/files/file">
<xsl:value-of select="'file_number,file_name,document_number,duplex,envelope_number,start_page,page_count'"/>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="documents/document"/>
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="document">
<xsl:value-of select="ancestor::file/@key"/>
<xsl:value-of select="','"/>
<xsl:value-of select="ancestor::file/file_name"/>
<xsl:value-of select="','"/>
<xsl:value-of select="@key"/>
<xsl:value-of select="','"/>
<xsl:value-of select="@duplex"/>
<xsl:value-of select="','"/>
<xsl:value-of select="@envelope_number"/>
<xsl:value-of select="','"/>
<xsl:value-of select="start_page"/>
<xsl:value-of select="','"/>
<xsl:value-of select="page_count"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/perf_sheet"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_1"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_2"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_3"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_4"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_5"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/insert_bin_6"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_name"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_1"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_2"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_3"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_4"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_5"/>
<xsl:value-of select="','"/>
<xsl:value-of select="print/add_6"/>
<xsl:value-of select="','"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
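The repeated title line comes from the header being written inside the template that matches file, which runs once per file node. One way around it, sketched here under the same structural assumptions as the stylesheet above, is to write the header in the root template and apply templates straight to the document elements, reaching back to the file with ancestor:: as before. Only the first seven columns are shown; the print columns would follow the same pattern.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Header is written here, so it appears once regardless of the number of file nodes -->
    <xsl:text>file_number,file_name,document_number,duplex,envelope_number,start_page,page_count&#10;</xsl:text>
    <!-- Go directly to the repeating element that corresponds to one CSV row -->
    <xsl:apply-templates select="root/files/file/documents/document"/>
  </xsl:template>

  <xsl:template match="document">
    <!-- Values from the enclosing file element -->
    <xsl:value-of select="ancestor::file/@key"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="ancestor::file/file_name"/>
    <xsl:text>,</xsl:text>
    <!-- Values from the document itself -->
    <xsl:value-of select="@key"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="@duplex"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="@envelope_number"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="start_page"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="page_count"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the template for file is gone, nothing is emitted per file node any more; every row comes from exactly one document element.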
ASKER
Attached is a new XML file with multiple file nodes.
Statements-20140225131628-index.xml
ASKER
Thank you VERY much for your help and patience.
ASKER
Fast and great answer.
Thank you VERY much; through this simple effort of yours I have learned quite a lot.
ASKER
Ahhh, thank you Mccarl.
Yes, Geert provided the majority of the solution to my issue and some clarification.
Thank you BOTH for your help on this at such short notice.
Welcome,
(I went to bed before your follow up came in, that is why I left it)
@mccarl, thanks for stepping in, both technically and administratively
No worries, glad to help! :)
Just in case you are aiming for Excel or similar, don't do CSV.
When I need that, I transform to HTML tables instead.
That way I don't have to worry about character encoding, and I don't have to take care of potential " characters or newlines inside data fields.
If you transform to an HTML table instead of CSV and give the output file an .xls extension, Excel will import it with no worries.
But the comment about digging for the deepest repetitive element still holds, of course.
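For illustration, the HTML-table variant might look roughly like the sketch below. It assumes the same element and attribute names as the stylesheets earlier in this thread and shows only a subset of the columns; the remaining print fields would be added as further th/td pairs.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- html output method: the serializer handles escaping and encoding -->
  <xsl:output method="html" encoding="UTF-8"/>

  <xsl:template match="/">
    <html>
      <body>
        <table>
          <!-- Header row, written once -->
          <tr>
            <th>file</th>
            <th>file_name</th>
            <th>document_key</th>
            <th>duplex</th>
            <th>envelope_number</th>
            <th>add_name</th>
          </tr>
          <!-- One table row per document, the deepest repeating element -->
          <xsl:apply-templates select="root/files/file/documents/document"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="document">
    <tr>
      <td><xsl:value-of select="ancestor::file/@key"/></td>
      <td><xsl:value-of select="ancestor::file/file_name"/></td>
      <td><xsl:value-of select="@key"/></td>
      <td><xsl:value-of select="@duplex"/></td>
      <td><xsl:value-of select="@envelope_number"/></td>
      <td><xsl:value-of select="print/add_name"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Since each value sits in its own td, a comma, quote or line break inside a field such as add_name no longer needs any special handling, which is the main attraction over hand-built CSV.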