Solved

Converting XML to CSV with PHP

Posted on 2013-01-04
10
1,364 Views
Last Modified: 2013-01-07
I need to produce a PHP script (or find a function that can be called within the script) to regularly convert an XML file to CSV, examples of which are attached.

I have started looking at converting the XML to an object using PHP 5 simplexml_load_file, and then iterating through to write values to a CSV file using fputcsv.

However, I've hit three snags that I'm not sure how to go about solving -

The number of rows written to the CSV are dictated by the number of '<message>' fields in the xml document, but using a 'foreach' argument based on <message> would then need to be able to retrieve values from parent fields of that message (and children of fields on the same level of hierarchy as <message>.
Some fields are not included for all <satellitePass> (see second row in CSV for bird 2) - Not sure how to ensure these are consistently represented by an empty value in the CSV without having to do a check for each possible value to see if it is present.
Field keys need to be converted to header names for the first row of the CSV - This is not vital - they shouldn't change so could be pre-written to the CSV before parsing the rest of XML.

Any tips on the best (most sane) way to approach this would be gratefully received - I hope there are some shortcuts I'm not aware of rather than a very dense script full of deeply nested 'if' arguments to check the possible existence of each field.

Thanks.
example-XML-input.xml
example-required-CSV-output.csv
0
Comment
Question by:williamskellorn
  • 5
  • 3
  • 2
10 Comments
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 475 total points
ID: 38744465
In my experience of transforming the most complex and nested XML into CSV, I have always found that
- precooked tools fail as soon as some complexity comes in
- XSLT usually gives all the required flexibility you need
I would suggest (given teh XML and the required result CSV) that you go down that route,
have an XSLT that transforms your XML directly in the CSV you need. It is pretty easy to execute an XSLT inside a PHP script, so you are safe

How flexible to you need this to be? I can cook you an XSLT if you want

I have answered a bunch of questions on EE already dealing with XML, XSLT and CSV, you might find some inspiration there
0
 

Author Comment

by:williamskellorn
ID: 38744503
Hi Gertone,

Thanks for the speedy answer!

The only varying factors are likely to be number of platforms, number of satellite passes per platform, and number of messages per satellite - so I think a bespoke XSLT script is looking like a good option.

Hadn't come across XSLT before, I'll dig into it (and your other posts) - but if it's a quickie for you to put together something that does the job, or gets me on the way, that would be fantastic!
0
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 475 total points
ID: 38745422
I think I covered the flexibility you need, yet I did not over-engineered stuff, so quite a bit is hard coded.... but I think this will work for you
XML-2-CSV.xsl
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 38745426
Let me know if you don't know how to initiate an XSLT in PHP
0
 
LVL 108

Assisted Solution

by:Ray Paseur
Ray Paseur earned 25 total points
ID: 38746560
Gertone is usually pretty good with this stuff, and if you like his solution, go with it.  But I am moved to ask, "Why are you doing this at all?"  XML is a hierarchical document (and in counting the curly-braces from the object, it looks to me like you have ten levels of hierarchy here), and a CSV is a two-dimensional tabular representation of data like a spreadsheet.  There is not really a good way to represent the hierarchy in CSV, and such a transformation usually results in an unwieldy amount of repetition.  Maybe if you can tell us a little more about where the data comes from and how it needs to be consumed, we might be able to suggest an alternate scheme.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 475 total points
ID: 38747170
Hi Ray, that is an interesting question and I can not answer it for williamskellorn of course.
But it is a request I often get outside EE... and in my conversations with customers for XML2CSV stuff, it usually boils down to the same reason: there is a system that generates hierarchical reports in XML and a human wants to verify the reports in Excel. And allthought there is an awfull lot of repetition (in this case I believe the number of value fields in the table are about double the fields from the XML) it makes sense in some way.
So rather than asking the question "why?", I tend to think, "spreadsheet visualisation" for sure.

But you are very right to ask this question. If the purpose of this exercise would be to load the result CSV in a one table relational database, we should think about a different multi table approach (to name one thing)
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 38747337
Yep, Your assumptions are probably correct and that's right where I am going with the question!
0
 

Author Comment

by:williamskellorn
ID: 38750186
Gertone and Ray - Thank you both for thorough and rigourous answers to the initial question.

To fill in a little of the reasoning - data currently is provided via daily CSV attachments and these are stored as well as being parsed into a single table mySQL database. Currently fewer messages are delivered as there is an algorithm in effect that chooses the 'best' message for a satelite pass. We can move to retrieve the data via a webservice which returns octet stream xml, in the form shown in the example.

In the short-term we need to replicate the original workflow as closely as possible so that the data may still be stored as intermediate CSV files, mirroring the current workflow while we run both workflows in tandem.

Once we have moved over to the webservice alone, I would then potentially look to remove the intermediate storage step and write the xml directly to the database. Either way, the reduction of the number of messages to pick the best will need to be replicated our end, thus a single table approach will probably still be sufficient.

I'll be back on this project tomorrow and will look at initiating the XSLT script you have kindly provided via PHP then, as well as consider the flexibility versus hard-coded trade-off.

I'll mark up Gertone's solution providing the XSLT as accepted now though - didn't want to make it look like I wasn't grateful!
0
 

Author Closing Comment

by:williamskellorn
ID: 38750206
Very impressed by the swiftness, thought and effort afforded to my first question on EE.

My thanks to Gertone and Ray_Paseur!
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 38750224
welcome,
don't hesitate to come back here if you need changes to the XSLT, explanation of what it does and how it does it, or help initiating it from PHP
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

I imagine that there are some, like me, who require a way of getting currency exchange rates for implementation in web project from time to time, so I thought I would share a solution that I have developed for this purpose. It turns out that Yaho…
Since pre-biblical times, humans have sought ways to keep secrets, and share the secrets selectively.  This article explores the ways PHP can be used to hide and encrypt information.
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now