Solved

Extracting a subset of XML using PHP

Posted on 2011-02-28
Last Modified: 2012-05-11
I have some XML

<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>

I need to extract the XML for each article into a string for insertion into a database. Something like

foreach (article){
$string = article;
doSomethingWithString();
}

The string would be "<article code='1'><title>Some Title</title><body>Some Text</body></article>"

How can I do this?

Thanks

Mike
Question by:hungoveragain
10 Comments
 
LVL 7

Expert Comment

by:szewkam
ID: 35004686
There are a couple of solutions to your problem. For example, you can use PHP's SimpleXML (see the code snippet).
You could also use a regular expression (http://pl.php.net/manual/en/function.preg-match-all.php).


<?php
$xmlstr = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

$xml = new SimpleXMLElement($xmlstr);

foreach($xml->article as $article) {
  echo $article->title.'<br />'.$article->body.'<br />';
}

 

Author Comment

by:hungoveragain
ID: 35004781
Unfortunately this doesn't give me XML. Please also bear in mind that I won't necessarily know what the tags / attributes are. There may be some attributes that change dynamically.

If the XML is
<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>

I will need the string to be

"<article code='1'><title>Some Title</title><body>Some Text</body></article>"

but if there is an additional attribute such as name='somename' the string will need to be

"<article code='1' name='somename'><title>Some Title</title><body>Some Text</body></article>"

Basically, I need a substring that starts at each <article> and ends at the matching </article>, including both tags.

Thanks

Mike
 
LVL 7

Expert Comment

by:szewkam
ID: 35004850
Even with an additional attribute, that string is still plain XML; SimpleXML will handle it without problems, and my script will work.
As long as all your articles are in <article> and the titles and bodies are in <title> and <body>, this will work regardless of extra attributes.
 

Author Comment

by:hungoveragain
ID: 35004916
But the code above doesn't output valid XML.

Using your code above, I get

SomeTitle<br />
SomeText<br />
AnotherTitle<br />
Some More Text<br />

what I need is

"<article code='1'><title>Some Title</title><body>Some Text</body></article>"

Additionally, if there are any unexpected tags or attributes in the XML, they will be missed.

Mike
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 35004931
I would use SimpleXML myself, but if your XML is in a string, here is a code fragment that uses a regex to pull out each article:

<?php

$xmlString = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";


// \b (word boundary) rather than \s+, so an <article> with no attributes also matches
preg_match_all( '!(<article\b[^>]*>.*?</article>)!s', $xmlString, $matches );

print_r( $matches [1] );
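
For comparison with the regex above, the same extraction can be sketched with PHP's DOM extension, which parses the markup instead of pattern-matching it. This is only a sketch, assuming the DOM extension is enabled; the name='somename' attribute is taken from the asker's example, and $articles is just an illustrative variable name. Note that the serializer writes attribute values with double quotes, so the strings are equivalent XML but not byte-for-byte identical to the input.

```php
<?php
// Sketch only: assumes the DOM extension (normally bundled with PHP) is enabled.
$xmlString = "<articles>
<article code='1' name='somename'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

$doc = new DOMDocument();
// Must be set before loading: drops whitespace-only text nodes between tags,
// so each article serializes as one compact string.
$doc->preserveWhiteSpace = false;
$doc->loadXML($xmlString);

$articles = []; // illustrative name, not from the thread
foreach ($doc->getElementsByTagName('article') as $node) {
    // saveXML($node) serializes only this element, with whatever
    // attributes and child tags it happens to contain.
    $articles[] = $doc->saveXML($node);
}

print_r($articles);
```

Because the element is serialized as a whole, attributes or child tags that are not known in advance should come through unchanged.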

 

Author Comment

by:hungoveragain
ID: 35004943
Just for the sake of clarity I intend to put the XML in a database table.

code || xml
1 || <article code='1'><title>Some Title</title><body>Some Text</body></article>
2 || <article code='2'><title>Another Title</title><body>Some More Text</body></article>

and so on.

Mike
 
LVL 7

Expert Comment

by:szewkam
ID: 35005464
OK, I didn't understand what you were trying to achieve.
Try my code (in the snippet).
<?php
$xmlstr = "<articles>
<article code='1' sometag='test'>
 <title anotherattribute='0'>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

$xml = new SimpleXMLElement($xmlstr);

foreach($xml->article as $article) {
  echo "<article code='".$article['code']."'><title>".$article->title."</title><body>".$article->body."</body></article>";
}

 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 35005928
UNTESTED code below

<?php

$xmlString = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";


// \b (word boundary) rather than \s+, so an <article> with no attributes also matches
preg_match_all( '!(<article\b[^>]*>.*?</article>)!s', $xmlString, $matches );

foreach( $matches[1] as $aMatch ) {

     preg_match('!.+code=\'([0-9]+)\'.+!s', $aMatch, $codeArray );

     if ( isset( $codeArray[1] ) ) {
          mysql_query("INSERT INTO myTable ( code, xml ) VALUES( {$codeArray[1]}, '". mysql_real_escape_string( $aMatch )."' ) ");
     }
}

 
LVL 109

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 35006201
I'm not sure about putting the XML into the database; you might be able to store the serialized object and get better performance that way. But if you want to isolate nodes of the XML structure and keep them as XML, this shows how to use the asXML() method to retrieve the XML. HTH, ~Ray
<?php // RAY_temp_hungoveragain.php
error_reporting(E_ALL);
echo "<pre>";

// THE XML FROM THE EXAMPLE AT EE (SLIGHTLY MODIFIED)
$xml = <<<XML
<articles>
<article code='1' foo='Bar'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>
XML;

// MAKE AN OBJECT
$obj = SimpleXML_Load_String($xml);

// // ACTIVATE THIS TO LOOK AT THE OBJECT WE JUST MADE
// var_dump($obj);

// ITERATE OVER THE OBJECT TO EXTRACT ARTICLES
foreach ($obj as $article)
{
    // RENDER EACH ARTICLE INTO XML
    $str = $article->AsXML();
    echo PHP_EOL;
    echo htmlentities($str);
    echo PHP_EOL;
}
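
As a follow-on sketch, the asXML() output above might be paired with each article's code attribute to build the code || xml rows described earlier in the thread. The $rows array and the plain-text output format are assumptions for illustration only; no database call is made here.

```php
<?php
// Sketch only: builds code => xml pairs; $rows is a hypothetical name.
$xml = <<<XML
<articles>
<article code='1' foo='Bar'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>
XML;

$obj = simplexml_load_string($xml);

$rows = [];
foreach ($obj->article as $article) {
    // Attributes are read with array syntax; cast to string to get the value.
    $code = (string) $article['code'];
    // asXML() on a child element returns that element's markup only,
    // including attributes such as foo='Bar' that were not known in advance.
    $rows[$code] = $article->asXML();
}

foreach ($rows as $code => $fragment) {
    echo $code, ' || ', $fragment, PHP_EOL;
}
```

Each pair could then be bound to a prepared statement (PDO or mysqli, for example) instead of being concatenated into the SQL. asXML() keeps the line breaks and indentation found inside each article, so the stored strings will not be single-line unless that whitespace is stripped separately.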

 
LVL 109

Expert Comment

by:Ray Paseur
ID: 35009733
Thanks for the points - it's a really good question, ~Ray