[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3783
  • Last Modified:

Converting XML file to CSV

I've looked high and low for a PHP or Perl script that will take an XML file and turn it into a CSV or other delimited txt file.  It seems as if this XML file was created by Excel, but I can't import it into Excel and I can't figure out how to get it into a readable-usable form.

Thanks!

Here's my XML:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <LastAuthor>Me</LastAuthor>
  <Created>2006-02-27T19:24:51Z</Created>
  <Version>11.6568</Version>
 </DocumentProperties>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>6285</WindowHeight>
  <WindowWidth>11340</WindowWidth>
  <WindowTopX>360</WindowTopX>
  <WindowTopY>135</WindowTopY>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s21">
   <Interior/>
  </Style>
  <Style ss:ID="s22">
   <Font x:Family="Swiss" ss:Color="#FF0000"/>
   <Interior/>
  </Style>
 </Styles>
 <Worksheet ss:Name="_combind">
  <Table ss:ExpandedColumnCount="15" ss:ExpandedRowCount="1326" x:FullColumns="1"
   x:FullRows="1" ss:StyleID="s21">
   <Column ss:StyleID="s21" ss:Width="141"/>
   <Column ss:Index="8" ss:StyleID="s21" ss:Width="194.25"/>
   <Column ss:StyleID="s21" ss:Width="192"/>
   <Column ss:Index="15" ss:StyleID="s21" ss:Width="192"/>
   <Row>
    <Cell><Data ss:Type="String">Full Name (Name Field)</Data></Cell>
    <Cell><Data ss:Type="String">Title</Data></Cell>
    <Cell><Data ss:Type="String">Salutation</Data></Cell>
    <Cell><Data ss:Type="String">Nickname</Data></Cell>
    <Cell><Data ss:Type="String">Last Name</Data></Cell>
    <Cell><Data ss:Type="String">First Name</Data></Cell>
    <Cell><Data ss:Type="String">Middle Initial</Data></Cell>
    <Cell><Data ss:Type="String">Office Name for Address</Data></Cell>
    <Cell><Data ss:Type="String">Member E-mail</Data></Cell>
    <Cell><Data ss:Type="String">Care of</Data></Cell>
    <Cell><Data ss:Type="String">Address</Data></Cell>
    <Cell><Data ss:Type="String">City</Data></Cell>
    <Cell><Data ss:Type="String">State</Data></Cell>
    <Cell><Data ss:Type="String">Zip</Data></Cell>
    <Cell><Data ss:Type="String">E-mail</Data></Cell>
   </Row>
   <Row ss:Hidden="1">
    <Cell><Data ss:Type="String">Gloria</Data></Cell>
    <Cell><Data ss:Type="String">Ms. </Data></Cell>
    <Cell><Data ss:Type="String">Smith</Data></Cell>
    <Cell><Data ss:Type="String">Gloria</Data></Cell>
    <Cell><Data ss:Type="String">Smith</Data></Cell>
    <Cell><Data ss:Type="String">Gloria</Data></Cell>
    <Cell><Data ss:Type="String">R</Data></Cell>
    <Cell><Data ss:Type="String">PROPERTIES</Data></Cell>
    <Cell><Data ss:Type="String">myemail@email.com</Data></Cell>
    <Cell ss:Index="11"><Data ss:Type="String">400 Penn</Data></Cell>
    <Cell><Data ss:Type="String">Whitsett</Data></Cell>
    <Cell><Data ss:Type="String">State</Data></Cell>
    <Cell><Data ss:Type="Number">27111</Data></Cell>
    <Cell><Data ss:Type="String">myemail@email.com</Data></Cell>
   </Row>
    </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <Print>
    <ValidPrinterInfo/>
    <HorizontalResolution>-1</HorizontalResolution>
    <VerticalResolution>0</VerticalResolution>
   </Print>
   <Selected/>
   <TopRowVisible>62</TopRowVisible>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>91</ActiveRow>
     <ActiveCol>5</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
  <Sorting xmlns="urn:schemas-microsoft-com:office:excel">
   <Sort>Last Name</Sort>
  </Sorting>
 </Worksheet>
</Workbook>
0
marcparillo
Asked:
marcparillo
1 Solution
 
nizsmoDeveloperCommented:
Do you really need a PHP or a Perl solution?
Since you can and have saved your XML into a file, you can use a VB solution:
http://www.codeproject.com/vbscript/xml2csv.asp

Otherwise, the closest thing I can find is converting CSV file to XML, maybe you can try and reverse-engineer the code to convert a XML to CSV?
http://www.bytemycode.com/snippets/snippet/448/

Let me know how you go.
0
 
mjcoyneCommented:
If you change the line: <Row ss:Hidden="1"> to: <Row ss:Hidden="0"> or just <row>, it'll import correctly to Excel.  You can also open it as is with Excel, select the whole worksheet, and select Format->Row->Unhide.
0
 
marcparilloAuthor Commented:
Thanks,

mjcoyne -- your solution seems to be the easiest, but it didn't work for me.  Am I importing the data correctly?  I simply opened Excel and clicked File -> Open and then selected the 'myfile.xml' file on my desktop to open.

Is there another way to import an XML file?  I only know how to import a text file into Excel by clicking Data -> Get External Data.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
mjcoyneCommented:
I cut your data from here, and pasted it in a text document I saved as ee-xml.xml.  I then opened this file in Excel (using Right-click->Send to... or opening the file directly in Excel with File->Open), pressed Ctrl-Home to get to the first cell in the workbook (the Excel cursor is originally in cell F92, as specified by your XML in the <ActiveRow> and <ActiveCol> lines).

At this point, I see just the column header lines.  I also notice that the rows are numbered 1, 3, 4, 5, ..., so row 2 is "missing".  That's because it's hidden, and the instructions I gave before rectify that.  Once unhidden, I can see the data under the column headers.

If you'd rather a Perl script to output a CSV file, try something like this:

#!/usr/bin/perl -w
use strict;
my (%lines, $section);

open (IN, "ee-xml.xml") or die "Can't open input file $!\n";
open (OUT, ">ee-xml.csv") or die "Can't open output file $!\n";

while (<IN>) {
    if (/<Row/) {
        $section++;
    }
    if (/<Row/ .. /<\/Row>/) {
        if (/^\s*<Cell>/) {
            my ($field) = (/<Data.+?\">(.+?)<\/Data>/);
            push @{$lines{$section}}, $field;
        }
    }
}

foreach my $section (sort keys %lines) {
    print OUT join (",", @{$lines{$section}}), "\n";
}
0
 
ozoCommented:
> Here's my XML:
What do you want the corresponding CSV file to look like?
0
 
mjcoyneCommented:
I assumed marcparillo wanted the CSV file to look, once opened in Excel, like what I'm seeing when I open the xml file directly in Excel, so that's what my script does.
0
 
marcparilloAuthor Commented:
It worked!

mjcoyne -- I didn't realize it was that easy (now I feel stupid).  The "hidden" variable is what was screwing it up... and I'll keep your Perl script just in case I need to do it without Excel.

Thanks again!!

0
 
mjcoyneCommented:
No need to feel stupid, marcparillo.  That had me stumped for a bit, too -- until I happened to notice the missing second row...:)
0

Featured Post

[Webinar] Cloud and Mobile-First Strategy

Maybe you’ve fully adopted the cloud since the beginning. Or maybe you started with on-prem resources but are pursuing a “cloud and mobile first” strategy. Getting to that end state has its challenges. Discover how to build out a 100% cloud and mobile IT strategy in this webinar.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now