Parsing a non standard XML format

I have an XML format that i'm downloading from a server that looks something like this...

Since all of the column headings are wrapped in the <COLUMNS> record and each set of data is in the <DATA> record and separated by tabs, what is a good way to parse this with PHP?

The ultimate goal is to push the data into my MySQL db.
<SETME ReplyCode="0" ReplyText="Operation Successful">
	<DELIMITER value="09"/>
	<COLUMNS>Favorite Fruit	Favorite Sandwich	Favorite TV Show</COLUMNS>
	<DATA>Watermelon	Peanut butter and Jelly	Beverly Hillbillies</DATA>
	<DATA>Apple	Pastrami on Rye	NYPD Blue</DATA>
	<DATA>Orange	Hamburger	Star Trek</DATA>
	<MAXROWS/>
</SETME>

Open in new window

LVL 1
befidledAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

gambit_642Commented:
The second line has a delimiter value which specifies the ascii value used to divide the column entries.
It is "09" in your example which is a "tab" space.

What you want to do is an explode() on the columns line to get the field name, like:
$column _titles = explode('\t', $column_line);

Do a similar explode on each "data" line
Then do a foreach() loop on the resulting array to drop them in to one associative array of which the keys are named from your column_title array.
0
gambit_642Commented:
Had some spare time so I write it:

This will take the XML and drop each entry in a associative (titles from column line) array called "data".
<?php
$filename = "your.xml"; // Set File name
 
 
$tags = array("<DATA>", "</DATA>", "<COLUMNS>", "</COLUMNS>");
$handle = fopen($filename, "rb");
 
$entry_count = 0;
 
while (!feof($handle)) {
	$this_line = fgets($handle, filesize($filename)); // read a line
	$this_line = trim($this_line); // clean white space ends
	if(stristr($this_line, "<COLUMNS>")) { // If Column title
		$this_line = str_replace($tags, "", $this_line); // Get rid of tags
		$column_titles = explode ("\t", $this_line); // Use tabs to break string, put in column array
	}
	elseif(stristr($this_line, "<DATA>")) { // If data entry
		$this_line = str_replace($tags, "", $this_line); // Get rid of tags
		$data_entry[$entry_count] = explode("\t", $this_line); // Use tabs to break string, put in data array
		$entry_count++;
	}
}
fclose($handle);
 
// Build 1 assosative array from $column_titles and $data_entry's
$count = 0;
foreach($data_entry as $entry){
	$sub_count = 0;
	foreach($entry as $field){
		$data[$count][$column_titles[$sub_count]] = $field;
		$sub_count++;
	}
	$count++;
}
 
// All data is in associative array "$data"
print_r($data);
?>

Open in new window

0
RurneCommented:
If you're working in PHP5, you can take advantage of the DOM functions that are included specifically for XML parsing:

http://us.php.net/manual/en/ref.dom.php

For your example document, I'd recommend something along the lines of:
<?php
 
$fname = 'example.xml';   // dummy file name
 
$xmlParser = new DOMDocument();
$xmlParser->load($fname);
 
/* Extract out the different nodes */
$delim_ls = $xmlParser->getElementsByTagName('DELIMITER');
$delimiter = $delim_ls->item(0);  // $delim_ls is a DOMNodeList with just one item in it
 
$colum_ls = $xmlParser->getElementsByTagName('COLUMNS');
$column = $colum_ls->item(0);
 
$data_ls  = $xmlParser->getElementsByTagName('DATA');
 
$delim_char = $delimiter->getAttribute('value');   // numeric value of delimiter; '09' in above case
$delim_char = chr($delim_char)                     // convert from decimal value to actual character
 
$columns = explode($delim_char, $column->nodeValue);
 
for($i = 0; $i < $data_ls->length; $i++)
{
    $data_row = $data_ls->item($i);
    $data = explode($delim_char, $data_row->nodeValue);
    /* do some querying here
     * basic example, but it's still recommended to look into 
     * using mysql_real_escape_string() to sanitize your data
     *
     * mysql_query("INSERT INTO myTable (" . implode(',', $columns) . ') VALUES ("' . implode('","', $data) . '")');
     *
     */
}
 
?>

Open in new window

0
gambit_642Commented:
That will work two...
your missing termination on line 18;
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
RurneCommented:
Good catch.  Too much Python coding lately, it seems.
Regardless, it's usually a bit more flexible to use the DOM tools to manipulate XML documents, and it gives you the flexibility to detect things like the delimiter, for example, and it also handles non-significant whitespace gracefully.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
PHP

From novice to tech pro — start learning today.