Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1522
  • Last Modified:

Converting XML to CSV with PHP

I need to produce a PHP script (or find a function that can be called within the script) to regularly convert an XML file to CSV, examples of which are attached.

I have started looking at converting the XML to an object using PHP 5 simplexml_load_file, and then iterating through to write values to a CSV file using fputcsv.

However, I've hit three snags that I'm not sure how to go about solving -

The number of rows written to the CSV are dictated by the number of '<message>' fields in the xml document, but using a 'foreach' argument based on <message> would then need to be able to retrieve values from parent fields of that message (and children of fields on the same level of hierarchy as <message>.
Some fields are not included for all <satellitePass> (see second row in CSV for bird 2) - Not sure how to ensure these are consistently represented by an empty value in the CSV without having to do a check for each possible value to see if it is present.
Field keys need to be converted to header names for the first row of the CSV - This is not vital - they shouldn't change so could be pre-written to the CSV before parsing the rest of XML.

Any tips on the best (most sane) way to approach this would be gratefully received - I hope there are some shortcuts I'm not aware of rather than a very dense script full of deeply nested 'if' arguments to check the possible existence of each field.

Thanks.
example-XML-input.xml
example-required-CSV-output.csv
0
williamskellorn
Asked:
williamskellorn
  • 5
  • 3
  • 2
4 Solutions
 
Geert BormansCommented:
In my experience of transforming the most complex and nested XML into CSV, I have always found that
- precooked tools fail as soon as some complexity comes in
- XSLT usually gives all the required flexibility you need
I would suggest (given teh XML and the required result CSV) that you go down that route,
have an XSLT that transforms your XML directly in the CSV you need. It is pretty easy to execute an XSLT inside a PHP script, so you are safe

How flexible to you need this to be? I can cook you an XSLT if you want

I have answered a bunch of questions on EE already dealing with XML, XSLT and CSV, you might find some inspiration there
0
 
williamskellornAuthor Commented:
Hi Gertone,

Thanks for the speedy answer!

The only varying factors are likely to be number of platforms, number of satellite passes per platform, and number of messages per satellite - so I think a bespoke XSLT script is looking like a good option.

Hadn't come across XSLT before, I'll dig into it (and your other posts) - but if it's a quickie for you to put together something that does the job, or gets me on the way, that would be fantastic!
0
 
Geert BormansCommented:
I think I covered the flexibility you need, yet I did not over-engineered stuff, so quite a bit is hard coded.... but I think this will work for you
XML-2-CSV.xsl
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
Geert BormansCommented:
Let me know if you don't know how to initiate an XSLT in PHP
0
 
Ray PaseurCommented:
Gertone is usually pretty good with this stuff, and if you like his solution, go with it.  But I am moved to ask, "Why are you doing this at all?"  XML is a hierarchical document (and in counting the curly-braces from the object, it looks to me like you have ten levels of hierarchy here), and a CSV is a two-dimensional tabular representation of data like a spreadsheet.  There is not really a good way to represent the hierarchy in CSV, and such a transformation usually results in an unwieldy amount of repetition.  Maybe if you can tell us a little more about where the data comes from and how it needs to be consumed, we might be able to suggest an alternate scheme.
0
 
Geert BormansCommented:
Hi Ray, that is an interesting question and I can not answer it for williamskellorn of course.
But it is a request I often get outside EE... and in my conversations with customers for XML2CSV stuff, it usually boils down to the same reason: there is a system that generates hierarchical reports in XML and a human wants to verify the reports in Excel. And allthought there is an awfull lot of repetition (in this case I believe the number of value fields in the table are about double the fields from the XML) it makes sense in some way.
So rather than asking the question "why?", I tend to think, "spreadsheet visualisation" for sure.

But you are very right to ask this question. If the purpose of this exercise would be to load the result CSV in a one table relational database, we should think about a different multi table approach (to name one thing)
0
 
Ray PaseurCommented:
Yep, Your assumptions are probably correct and that's right where I am going with the question!
0
 
williamskellornAuthor Commented:
Gertone and Ray - Thank you both for thorough and rigourous answers to the initial question.

To fill in a little of the reasoning - data currently is provided via daily CSV attachments and these are stored as well as being parsed into a single table mySQL database. Currently fewer messages are delivered as there is an algorithm in effect that chooses the 'best' message for a satelite pass. We can move to retrieve the data via a webservice which returns octet stream xml, in the form shown in the example.

In the short-term we need to replicate the original workflow as closely as possible so that the data may still be stored as intermediate CSV files, mirroring the current workflow while we run both workflows in tandem.

Once we have moved over to the webservice alone, I would then potentially look to remove the intermediate storage step and write the xml directly to the database. Either way, the reduction of the number of messages to pick the best will need to be replicated our end, thus a single table approach will probably still be sufficient.

I'll be back on this project tomorrow and will look at initiating the XSLT script you have kindly provided via PHP then, as well as consider the flexibility versus hard-coded trade-off.

I'll mark up Gertone's solution providing the XSLT as accepted now though - didn't want to make it look like I wasn't grateful!
0
 
williamskellornAuthor Commented:
Very impressed by the swiftness, thought and effort afforded to my first question on EE.

My thanks to Gertone and Ray_Paseur!
0
 
Geert BormansCommented:
welcome,
don't hesitate to come back here if you need changes to the XSLT, explanation of what it does and how it does it, or help initiating it from PHP
0

Featured Post

Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

  • 5
  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now