Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Converting XML to CSV with PHP

Posted on 2013-01-04
10
Medium Priority
?
1,487 Views
Last Modified: 2013-01-07
I need to produce a PHP script (or find a function that can be called within the script) to regularly convert an XML file to CSV, examples of which are attached.

I have started looking at converting the XML to an object using PHP 5 simplexml_load_file, and then iterating through to write values to a CSV file using fputcsv.

However, I've hit three snags that I'm not sure how to go about solving -

The number of rows written to the CSV are dictated by the number of '<message>' fields in the xml document, but using a 'foreach' argument based on <message> would then need to be able to retrieve values from parent fields of that message (and children of fields on the same level of hierarchy as <message>.
Some fields are not included for all <satellitePass> (see second row in CSV for bird 2) - Not sure how to ensure these are consistently represented by an empty value in the CSV without having to do a check for each possible value to see if it is present.
Field keys need to be converted to header names for the first row of the CSV - This is not vital - they shouldn't change so could be pre-written to the CSV before parsing the rest of XML.

Any tips on the best (most sane) way to approach this would be gratefully received - I hope there are some shortcuts I'm not aware of rather than a very dense script full of deeply nested 'if' arguments to check the possible existence of each field.

Thanks.
example-XML-input.xml
example-required-CSV-output.csv
0
Comment
Question by:williamskellorn
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 1900 total points
ID: 38744465
In my experience of transforming the most complex and nested XML into CSV, I have always found that
- precooked tools fail as soon as some complexity comes in
- XSLT usually gives all the required flexibility you need
I would suggest (given teh XML and the required result CSV) that you go down that route,
have an XSLT that transforms your XML directly in the CSV you need. It is pretty easy to execute an XSLT inside a PHP script, so you are safe

How flexible to you need this to be? I can cook you an XSLT if you want

I have answered a bunch of questions on EE already dealing with XML, XSLT and CSV, you might find some inspiration there
0
 

Author Comment

by:williamskellorn
ID: 38744503
Hi Gertone,

Thanks for the speedy answer!

The only varying factors are likely to be number of platforms, number of satellite passes per platform, and number of messages per satellite - so I think a bespoke XSLT script is looking like a good option.

Hadn't come across XSLT before, I'll dig into it (and your other posts) - but if it's a quickie for you to put together something that does the job, or gets me on the way, that would be fantastic!
0
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 1900 total points
ID: 38745422
I think I covered the flexibility you need, yet I did not over-engineered stuff, so quite a bit is hard coded.... but I think this will work for you
XML-2-CSV.xsl
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
LVL 60

Expert Comment

by:Geert Bormans
ID: 38745426
Let me know if you don't know how to initiate an XSLT in PHP
0
 
LVL 111

Assisted Solution

by:Ray Paseur
Ray Paseur earned 100 total points
ID: 38746560
Gertone is usually pretty good with this stuff, and if you like his solution, go with it.  But I am moved to ask, "Why are you doing this at all?"  XML is a hierarchical document (and in counting the curly-braces from the object, it looks to me like you have ten levels of hierarchy here), and a CSV is a two-dimensional tabular representation of data like a spreadsheet.  There is not really a good way to represent the hierarchy in CSV, and such a transformation usually results in an unwieldy amount of repetition.  Maybe if you can tell us a little more about where the data comes from and how it needs to be consumed, we might be able to suggest an alternate scheme.
0
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 1900 total points
ID: 38747170
Hi Ray, that is an interesting question and I can not answer it for williamskellorn of course.
But it is a request I often get outside EE... and in my conversations with customers for XML2CSV stuff, it usually boils down to the same reason: there is a system that generates hierarchical reports in XML and a human wants to verify the reports in Excel. And allthought there is an awfull lot of repetition (in this case I believe the number of value fields in the table are about double the fields from the XML) it makes sense in some way.
So rather than asking the question "why?", I tend to think, "spreadsheet visualisation" for sure.

But you are very right to ask this question. If the purpose of this exercise would be to load the result CSV in a one table relational database, we should think about a different multi table approach (to name one thing)
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 38747337
Yep, Your assumptions are probably correct and that's right where I am going with the question!
0
 

Author Comment

by:williamskellorn
ID: 38750186
Gertone and Ray - Thank you both for thorough and rigourous answers to the initial question.

To fill in a little of the reasoning - data currently is provided via daily CSV attachments and these are stored as well as being parsed into a single table mySQL database. Currently fewer messages are delivered as there is an algorithm in effect that chooses the 'best' message for a satelite pass. We can move to retrieve the data via a webservice which returns octet stream xml, in the form shown in the example.

In the short-term we need to replicate the original workflow as closely as possible so that the data may still be stored as intermediate CSV files, mirroring the current workflow while we run both workflows in tandem.

Once we have moved over to the webservice alone, I would then potentially look to remove the intermediate storage step and write the xml directly to the database. Either way, the reduction of the number of messages to pick the best will need to be replicated our end, thus a single table approach will probably still be sufficient.

I'll be back on this project tomorrow and will look at initiating the XSLT script you have kindly provided via PHP then, as well as consider the flexibility versus hard-coded trade-off.

I'll mark up Gertone's solution providing the XSLT as accepted now though - didn't want to make it look like I wasn't grateful!
0
 

Author Closing Comment

by:williamskellorn
ID: 38750206
Very impressed by the swiftness, thought and effort afforded to my first question on EE.

My thanks to Gertone and Ray_Paseur!
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 38750224
welcome,
don't hesitate to come back here if you need changes to the XSLT, explanation of what it does and how it does it, or help initiating it from PHP
0

Featured Post

How to Use the Help Bell

Need to boost the visibility of your question for solutions? Use the Experts Exchange Help Bell to confirm priority levels and contact subject-matter experts for question attention.  Check out this how-to article for more information.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Part of the Global Positioning System A geocode (https://developers.google.com/maps/documentation/geocoding/) is the major subset of a GPS coordinate (http://en.wikipedia.org/wiki/Global_Positioning_System), the other parts being the altitude and t…
I was working on a PowerPoint add-in the other day and a client asked me "can you implement a feature which processes a chart when it's pasted into a slide from another deck?". It got me wondering how to hook into built-in ribbon events in Office.
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.
The viewer will learn how to create and use a small PHP class to apply a watermark to an image. This video shows the viewer the setup for the PHP watermark as well as important coding language. Continue to Part 2 to learn the core code used in creat…

715 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question