Link to home
Start Free TrialLog in
Avatar of sunil1982
sunil1982Flag for India

asked on

Converting XML Files to CSV Format using XSLT

Hi There,

I am looking for XSLT Solution for a Windows PC to convert XML Files containing thouasands of lines like the one given below.Each line is one record.

XML File  : 1 Line-Sample
******************************

<NexusDataRecord><SMS_CDR version="1.4" sequenceNumber="209166705"><Id>45824079985</Id><Number><B>978895583220</B></Number><Time><FirstActivity>1258463045.174</FirstActivity><Setup>1258463045.174</Setup><Elapsed>37172</Elapsed><LastActivity>1258463082.346</LastActivity><Portion><Clear>1258463082.346</Clear></Portion></Time><CLRInfo><CLR><CLR_TYPE>DTAP SMS MO</CLR_TYPE><Time><FirstActivity>1258463045.174</FirstActivity><Setup>1258463045.174</Setup><Clear>1258463082.346</Clear><LastActivity>1258463082.346</LastActivity></Time><GlobalTitle></GlobalTitle><PointCode><OPC>221</OPC><DPC>7751</DPC></PointCode><ExchCC></ExchCC><TransactId></TransactId><SubsystemNumber></SubsystemNumber><NI>2</NI><DNI>2</DNI><TMSI>2885909022</TMSI><IMEISV>4423560136837812</IMEISV><CGI><NEW><MCCMNC>97160</MCCMNC><LAC>20116</LAC><CI>22451</CI></NEW></CGI></CLR></CLRInfo></SMS_CDR></NexusDataRecord>
Avatar of zc2
zc2
Flag of United States of America image

The solution depends much how strict the order of the nodes in the hierarchy.
Below is a rough solution to "flat" all the values to a CSV line.  If each NexusDataRecord record has different structure it should be done by completely different way.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="text" encoding="iso-8859-1"/>
	<xsl:template match="/">
		<xsl:apply-templates select="//NexusDataRecord[1]" mode="head"/>
		<xsl:apply-templates select="//NexusDataRecord"/>
	</xsl:template>

	<xsl:template match="NexusDataRecord" mode="head">
		<xsl:apply-templates select="*" mode="head"/><xsl:text>&#x0A;</xsl:text>
	</xsl:template>

	<xsl:template match="*" mode="head">
		<xsl:apply-templates select="@*"  mode="head"/>
		<xsl:apply-templates select="*|text()"  mode="head"/>
 	</xsl:template>
	
	<xsl:template match="@*" mode="head">
		<xsl:value-of select="name()"/><xsl:text>, </xsl:text>
 	</xsl:template>
	<xsl:template match="text()" mode="head">
		<xsl:value-of select="name(..)"/><xsl:text>, </xsl:text>
 	</xsl:template>


	<xsl:template match="NexusDataRecord">
		<xsl:apply-templates select="*"/><xsl:text>&#x0A;</xsl:text>
	</xsl:template>
	
	<xsl:template match="*">
		<xsl:apply-templates select="@*"/>
		<xsl:apply-templates select="*|text()"/>
 	</xsl:template>
	
	<xsl:template match="@*|text()">
		<xsl:value-of select="."/><xsl:text>, </xsl:text>
 	</xsl:template>
	
</xsl:stylesheet>

Open in new window

Avatar of sunil1982

ASKER

Thanks,

All the records are having same structure, Only some fields may be null in other records.
Sorry to bother,

How can I install the XSLT program (Command Line) and use your code in my windows pc.

Rgds



Hi ,

I got the EXE file MS Website.

Now I have the DATA.xml file and MSXSL application, if possoble can you give me the exact commadn line I should use (Including your code) to convert the XML File to CSV Format.

Rgds



**********************************************************************
E:\>msxsl/?
Microsoft (R) XSLT Processor Version 4.0

Usage: MSXSL source stylesheet [options] [param=value...] [xmlns:prefix=uri...]

Options:
    -?            Show this message
    -o filename   Write output to named file
    -m startMode  Start the transform in this mode
    -xw           Strip non-significant whitespace from source and stylesheet
    -xe           Do not resolve external definitions during parse phase
    -v            Validate documents during parse phase
    -t            Show load and transformation timings
    -pi           Get stylesheet URL from xml-stylesheet PI in source document
    -u version    Use a specific version of MSXML: '2.6', '3.0', '4.0'
    -             Dash used as source argument loads XML from stdin
    -             Dash used as stylesheet argument loads XSL from stdin

E:\>
Let's suppose, you have stored the XSLT code to the file named tocsv.xsl,
then call the processor by the following command:
msxsl DATA.xml tocsv.xsl -o data.csv
I have tried as follows.
Your code is in side the SMS.xsl file.

**************************************************************
E:\>msxsl.exe SMS.xml SMS.xsl -o Out1.csv

Error occurred while parsing document.

Code:   0xc00ce555
URL:    file:///E:/SMS.xml
Line:   2
Column: 2
Only one top level element is allowed in an XML document.

E:\>dir *.x*l
 Volume in drive E is Data
 Volume Serial Number is 1CE1-695A

 Directory of E:\

12/18/2009  09:07 PM       152,525,855 SMS.xml
12/18/2009  09:09 PM             1,607 SMS.xsl
               2 File(s)    152,527,462 bytes
               0 Dir(s)  103,716,519,936 bytes free

E:\>
SMS.xsl
I guess, the problem is with the XML. Make sure all the record nodes are children of the single root node (its name does not matter), like in below:
<?xml version="1.0"?>
<root>
	<NexusDataRecord>
		.
		.
		.
	</NexusDataRecord>
	<NexusDataRecord>
		.
		.
		.
	</NexusDataRecord>
	<NexusDataRecord>
		.
		.
		.
	</NexusDataRecord>
</root>

Open in new window

Hi Zc2,

Yes thats true.

My XML File structure is like below.Is it possible for you to add some further funtion in your code to consider the situation.

***********************************************

<NexusDataRecord>
                .
                .
                .
        </NexusDataRecord>
        <NexusDataRecord>
                .
                .
                .
        </NexusDataRecord>
        <NexusDataRecord>
                .
                .
                .
        </NexusDataRecord>

I'm sorry, but that's impossible. The XML you provided to processing must be well-formed. An XML without the root node is not well formed and can't be processed by an XSLT.
You have two options:
1) Open the xml in text editor and embrace the nodes with the root node. Or do the same programmatically, if you have many files.
2) Use some different language, (JavaScript for instance) load the data file, extract the NexusDataRecord elements one by one and pass them to XML process. Then apply XSLT (it'd be little different than that I provided) on each. Append the result to the output CSV file.
Hi Zc2,

For option 1 , can you provide me a BAT File code so that I can insert a <root> and </root> at the end of each XM File.

Regards
Try the batch below (store it to a .bat file):
the first parameter - the name of the original file, the second the name of the output file.
echo ^<?xml version="1.0"?^> >%2
echo ^<root^> >>%2
type %1 >>%2
echo ^</root^> >>%2

Open in new window

I have coded as follows.

echo ^<?xml version="1.0"?^> >%SMS_20091218_175558_633_2.xml
echo ^<root^> >>%SMS_20091218_175558_633_2.xml
type %SMS_20091218_175558_633.xml >>%SMS_20091218_175558_633_2.xml
echo ^</root^> >>%SMS_20091218_175558_633_2.xml


Out put file came propoerly, but contains only 3 lines as follows.

<?xml version="1.0"?>  
<root>  
</root>

Is it possible to make the coding to take any *.xml file and readout to *_2.xml file.

Sorry top ask too many questions, I dont have much coding knowledge.

Rgds



I have tried following code it works fine...

echo ^<?xml version="1.0"?^> >%SMS_20091218_175558_633_2.xml
echo ^<root^> >>%SMS_20091218_175558_633_2.xml
type SMS_20091218_175558_633.xml >>%SMS_20091218_175558_633_2.xml
echo ^</root^> >>%SMS_20091218_175558_633_2.xml

But Is it possible to atuomate the file naming (in & OUT) ?

Rgds

the batch bellow accepts only one parameter as thi input file name, but creates the output file with the suffix "_2"
set out=%~n1_2.xml
echo Input: %1, Output: %out%

echo ^<?xml version="1.0"?^> >%out%
echo ^<root^> >>%out%
type %1 >>%out%
echo ^</root^> >>%out%

Open in new window

I wonder how you code works, it's invalid.
%SMS_20091218_175558_633_2.xml should not contain the '%' character.
Hi There,

Thanks for your help.

The csv conversion is happening well. But in some of the record I have a field called
<Number><A>978895583220</A></Number> but this field is not in all the record, similiary there are opther few fields.Can you provide me direct contact, so that I can pass the Sample xml file and I have XSD file also for the same.

And XML Schema file also.

Rgds
Please extract (using a text editor) all the different kind of records ant post them here.
Due to security reasons I am not posting the actual records.But attaching the XSD and Schema file for you reference.

Rgds


DataExport-5.3.3-sms-only.xsd
SMS-config.xml
if the record structure vary, then (as I mentioned in the first comment) you need a different solution.
Below is a sample code to create the first six fields of each record. Please finish it up by example.
In this solution the data each field is taken by a corresponding XPath expression.
<xsl:for-each select="SMS_CDR"> there just to change the current element not repeat the SMS_CDR in each XPath.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="text" encoding="iso-8859-1"/>

	<xsl:template match="/">
		<xsl:text>sequenceNumber, Id, Number/A, Number/B, CLR_TYPE, FirstActivity&#x0A;</xsl:text>

		<xsl:apply-templates select="//NexusDataRecord"/>
	</xsl:template>

	<xsl:template match="NexusDataRecord">
       <xsl:for-each select="SMS_CDR">
	                              <xsl:value-of select="@sequenceNumber"/>
	       <xsl:text>, </xsl:text><xsl:value-of select="Id"/>
	       <xsl:text>, </xsl:text><xsl:value-of select="Number/A"/>
	       <xsl:text>, </xsl:text><xsl:value-of select="Number/B"/>
	       <xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/CLR_TYPE"/>
	       <xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/FirstActivity"/>
	       <!-- and so on -->
       </xsl:for-each>
       <xsl:text>&#x0A;</xsl:text>
	</xsl:template>
</xsl:stylesheet>

Open in new window

Hi I have tried the following code, But I get only the fieldsvalue which you have created, any wrong doing in the code ?


<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text" encoding="iso-8859-1"/>
 
        <xsl:template match="/">
                <xsl:text>sequenceNumber, Id, Number/A, Number/B, CLR_TYPE, FirstActivity, Setup, Elapsed, LastActivity, Clear, CLR_TYPE, FirstActivity, Setup, Clear, LastActivity, OPC, DPC, NI, DNI, TMSI, IMEISV, MCCMNC, LAC, CI, CLR_TYPE, FirstActivity, Setup, Clear, LastActivity, A_GT, B_GT, OPC, DPC, Orig, ASSN, BSSN, NI, DNI, IMSI, MSISDN, SMSCaddr, MSCaddr, CLR_TYPE, FirstActivity, Setup, Clear, LastActivity, A_GT, B_GT, OPC, DPC, Orig, ASSN, BSSN, NI, DNI, IMSI, SMSCaddr, CLR_TYPE, FirstActivity, Setup, Clear, LastActivity, OPC, DPC, NI, DNI, TMSI, IMSI, MCCMNC, LAC, SAC&#x0A;</xsl:text>
 
                <xsl:apply-templates select="//NexusDataRecord"/>
        </xsl:template>
 
        <xsl:template match="NexusDataRecord">
       <xsl:for-each select="SMS_CDR">
                                      <xsl:value-of select="@sequenceNumber"/>
               <xsl:text>, </xsl:text><xsl:value-of select="Id"/>
               <xsl:text>, </xsl:text><xsl:value-of select="Number/A"/>
               <xsl:text>, </xsl:text><xsl:value-of select="Number/B"/>
               <xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/CLR_TYPE"/>
               <xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/FirstActivity"/>
               <xsl:text>, </xsl:text><xsl:value-of select=" Setup"/>
               <xsl:text>, </xsl:text><xsl:value-of select=" Elapsed"/>
               <xsl:text>, </xsl:text><xsl:value-of select=" LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Clear"/>
<xsl:text>, </xsl:text><xsl:value-of select=" CLR_TYPE"/>
<xsl:text>, </xsl:text><xsl:value-of select=" FirstActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Setup"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Clear"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" OPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" NI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DNI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" TMSI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" IMEISV"/>
<xsl:text>, </xsl:text><xsl:value-of select=" MCCMNC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LAC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" CI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" CLR_TYPE"/>
<xsl:text>, </xsl:text><xsl:value-of select=" FirstActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Setup"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Clear"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" A_GT"/>
<xsl:text>, </xsl:text><xsl:value-of select=" B_GT"/>
<xsl:text>, </xsl:text><xsl:value-of select=" OPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Orig"/>
<xsl:text>, </xsl:text><xsl:value-of select=" ASSN"/>
<xsl:text>, </xsl:text><xsl:value-of select=" BSSN"/>
<xsl:text>, </xsl:text><xsl:value-of select=" NI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DNI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" IMSI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" MSISDN"/>
<xsl:text>, </xsl:text><xsl:value-of select=" SMSCaddr"/>
<xsl:text>, </xsl:text><xsl:value-of select=" MSCaddr"/>
<xsl:text>, </xsl:text><xsl:value-of select=" CLR_TYPE"/>
<xsl:text>, </xsl:text><xsl:value-of select=" FirstActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Setup"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Clear"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" A_GT"/>
<xsl:text>, </xsl:text><xsl:value-of select=" B_GT"/>
<xsl:text>, </xsl:text><xsl:value-of select=" OPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Orig"/>
<xsl:text>, </xsl:text><xsl:value-of select=" ASSN"/>
<xsl:text>, </xsl:text><xsl:value-of select=" BSSN"/>
<xsl:text>, </xsl:text><xsl:value-of select=" NI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DNI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" IMSI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" SMSCaddr"/>
<xsl:text>, </xsl:text><xsl:value-of select=" CLR_TYPE"/>
<xsl:text>, </xsl:text><xsl:value-of select=" FirstActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Setup"/>
<xsl:text>, </xsl:text><xsl:value-of select=" Clear"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select=" OPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DPC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" NI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" DNI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" TMSI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" IMSI"/>
<xsl:text>, </xsl:text><xsl:value-of select=" MCCMNC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" LAC"/>
<xsl:text>, </xsl:text><xsl:value-of select=" SAC"/>
       </xsl:for-each>
       <xsl:text>&#x0A;</xsl:text>
        </xsl:template>
</xsl:stylesheet>
You need to specify path from the current node to each desired node, like you do with the file system.
By using <xsl:for-each select="SMS_CDR">,  I made the SMS_CDR element "current".
You need to provide the path from that element (SMS_CDR) to each other, like

<xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/Setup"/>
<xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/Elapsed"/>
<xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/LastActivity"/>
<xsl:text>, </xsl:text><xsl:value-of select="CLRInfo/CLR/Time/Portion/Clear"/>
and so on.

Note, my XPath strings use "/", yours do not. That the difference.
Hi Zc2,

I tried a lot but sorry I am not good in coding and the result is not the desired one.
If possible can you suggest me some program, I tried Oxygen. Frankly I am not sure what to do in that.

Rgds

I use a tool called XTrans (http://www.simxtech.com/users/zc2/xtrans/). I wrote it myself.
In it I load the XML, the XSL, then press Ctrl-T, and here is the transformation result.
Hi Zc2,

I have donloded & intalled Microsoft Visual C++ 2008 and whne I tried to open the XTRans.exe file it shows up "The application failed to start...."

Rgds
I guess, you downloaded a wrong version. You need for VC++ 2005, v8.0.50727.4053
(That much possible my fault, so I'm sorry)

I've put the correct version of the redistribution package on my site:
http://www.simxtech.com/users/zc2/xtrans/vcredist_x86.exe

Hi Zc2,

Many thanks for your great helps.

I have installed the xtrans and trying to iprove the code.
One done I will update you.

Rgds
Hi ZC2 ,

I am using the XTRANS now, Looking good.

But if you can refer the XML File, each record contains 2/3 subrecord.
When I open it in Excel they appear as different rows.

During XSL Development I tried to put all the sub record in to one Line.
But I ended up in repeating the same record again and again in the same line.

Is it possible to update the code to make individual line for each SUB-Record as EXCEL does ?

Rgds



TestCode11.xml
SMS2111.xsl
Yes, I noticed that. But how exactly you want to place the data from many sub records to just one record?
Hi Zc2,

No probs, even If I can do exactly like Excel does , then I can manage it...

Rgds
Hi ZC2,

Any Update ?
please, tell me the difficulties you currently need help with.
HI Zc2,

My Original Request is :-

In one of the record there are 2/3 sub records. But when I used the XSL code, it always works out the first one and leaves the 2nd and 3rd Records.It will helpful to get theem too in the same output line.

One more request is :-

Just oppposite to above,

When we open the XML File in Excel, It creates a new line for each sub record.
If I can get the out lines as exactly excel shows then that will help me in some other aspects too.

Regards

Sunil

sunil_almnthat@yahoo.com
SOLUTION
Avatar of zc2
zc2
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi zc2,

Allow me 3/4 days to test this and provide a feedback.
Unfortunately I am busy with some other testing.

Regards

Sunil
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial