Solved

Extracting a subset of XML using PHP

Posted on 2011-02-28
Last Modified: 2012-05-11
I have some XML:

<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>

I need to extract the XML for each article into a string for insertion into a database. Something like:

foreach (article){
$string = article;
doSomethingWithString();
}

The string would be "<article code='1'><title>Some Title</title><body>Some Text</body></article>"

How can I do this?

Thanks

Mike
Question by:hungoveragain
10 Comments
 
LVL 7

Expert Comment

by:szewkam
ID: 35004686
There are a couple of solutions to your problem. For example, you could use PHP's SimpleXML (see the code snippet below).
You could also use a regular expression (http://pl.php.net/manual/en/function.preg-match-all.php).


<?php
$xmlstr = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

$xml = new SimpleXMLElement($xmlstr);

foreach($xml->article as $article) {
  echo $article->title.'<br />'.$article->body.'<br />';
}

 

Author Comment

by:hungoveragain
ID: 35004781
Unfortunately this doesn't give me XML. Please also bear in mind that I won't necessarily know what the tags / attributes are. There may be some attributes that change dynamically.

If the XML is
<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>

I will need the string to be

"<article code='1'><title>Some Title</title><body>Some Text</body></article>"

but if there is an additional attribute such as name='somename' the string will need to be

"<article code='1' name='somename'><title>Some Title</title><body>Some Text</body></article>"

Basically I need a substring that starts with each <article> and ends with each </article> but including those.

Thanks

Mike
 
LVL 7

Expert Comment

by:szewkam
ID: 35004850
Even with an additional attribute, that string is still valid XML, so SimpleXML will handle it without problems and my script will work.
As long as all your articles are in <article>, and the titles and bodies are in <title> and <body>, this will work regardless of extra attributes.
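
To illustrate the point about unknown attributes and tags, here is a minimal sketch that walks each article without naming any attribute or child tag in advance. It relies on SimpleXML's attributes(), children() and getName() methods; the sample input reuses the name='somename' attribute from the comment above, and the $found array is just an illustrative name.

```php
<?php
// Sample input; the name='somename' attribute comes from the asker's example.
$xmlstr = "<articles>
<article code='1' name='somename'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
</articles>";

$xml = new SimpleXMLElement($xmlstr);

$found = array();
foreach ($xml->article as $article) {
    // attributes() yields every attribute, whatever it is called.
    foreach ($article->attributes() as $attrName => $attrValue) {
        $found[] = '@' . $attrName . '=' . (string) $attrValue;
    }
    // children() yields every child element, whatever its tag name is.
    foreach ($article->children() as $child) {
        $found[] = $child->getName() . '=' . (string) $child;
    }
}

print_r($found);
```

This only shows that nothing is lost when the names are unknown; it does not by itself rebuild the XML string.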
 

Author Comment

by:hungoveragain
ID: 35004916
But the code above doesn't spit out valid XML

Using your above code I get

Some Title<br />
Some Text<br />
Another Title<br />
Some More Text<br />

what I need is

"<article code='1'><title>Some Title</title><body>Some Text</body></article>"

Additionally if there are some unexpected tags or attributes within the XML they will be missed.

Mike
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 35004931
I would use SimpleXML myself, but if your XML is in a string, here is a code fragment that uses a regex to pull out the data:

<?php

$xmlString = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";


// \b instead of \s+ so that a bare <article> tag with no attributes also matches
preg_match_all( '!(<article\b[^>]*>.*?</article>)!s', $xmlString, $matches );

print_r( $matches [1] );

 

Author Comment

by:hungoveragain
ID: 35004943
Just for the sake of clarity I intend to put the XML in a database table.

code || xml
1 || <article code='1'><title>Some Title</title><body>Some Text</body></article>
2 || <article code='2'><title>Another Title</title><body>Some More Text</body></article>

and so on.

Mike
 
LVL 7

Expert Comment

by:szewkam
ID: 35005464
OK, I didn't understand what you were trying to achieve.
Try my code (in the snippet).
<?php
$xmlstr = "<articles>
<article code='1' sometag='test'>
 <title anotherattribute='0'>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

$xml = new SimpleXMLElement($xmlstr);

foreach($xml->article as $article) {
  echo "<article code='".$article['code']."'><title>".$article->title."</title><body>".$article->body."</body></article>";
}

 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 35005928
UNTESTED code below

<?php

$xmlString = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";


// \b instead of \s+ so that a bare <article> tag with no attributes also matches
preg_match_all( '!(<article\b[^>]*>.*?</article>)!s', $xmlString, $matches );

foreach( $matches[1] as $aMatch ) {

     preg_match('!.+code=\'([0-9]+)\'.+!s', $aMatch, $codeArray );

     if ( isset( $codeArray[1] ) ) {
          mysql_query("INSERT INTO myTable ( code, xml ) VALUES( {$codeArray[1]}, '". mysql_real_escape_string( $aMatch )."' ) ");
     }
}
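
The mysql_* functions used above were removed in PHP 7, so here is a hedged sketch of the same insert using PDO with a prepared statement. The table name myTable comes from the snippet above; the in-memory SQLite connection is only an assumption so that the example runs without a MySQL server (it needs the pdo_sqlite extension), and you would swap the DSN for your own database.

```php
<?php
$xmlString = "<articles>
<article code='1'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>";

// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE myTable (code INTEGER, xml TEXT)');

// Bound parameters replace manual escaping of the XML string.
$insert = $pdo->prepare('INSERT INTO myTable (code, xml) VALUES (:code, :xml)');

preg_match_all('!<article\b[^>]*>.*?</article>!s', $xmlString, $matches);

foreach ($matches[0] as $aMatch) {
    // Read the numeric code attribute from the opening <article> tag only.
    if (preg_match('!^<article\b[^>]*\bcode=[\'"]([0-9]+)[\'"]!', $aMatch, $codeArray)) {
        $insert->execute(array(':code' => (int) $codeArray[1], ':xml' => $aMatch));
    }
}

$rows = $pdo->query('SELECT code, xml FROM myTable ORDER BY code')->fetchAll(PDO::FETCH_ASSOC);
print_r($rows);
```

Articles without a numeric code attribute are skipped, as in the original loop.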

 
LVL 111

Accepted Solution

by:
Ray Paseur earned 2000 total points
ID: 35006201
I'm not sure about putting the XML into the database - you might be able to store the serialized object and get better performance that way. But if you want to isolate nodes of the XML structure and keep them as XML, this shows how to use the asXML() method to retrieve the XML. HTH, ~Ray
<?php // RAY_temp_hungoveragain.php
error_reporting(E_ALL);
echo "<pre>";

// THE XML FROM THE EXAMPLE AT EE (SLIGHTLY MODIFIED)
$xml = <<<XML
<articles>
<article code='1' foo='Bar'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>
XML;

// MAKE AN OBJECT
$obj = simplexml_load_string($xml);

// // ACTIVATE THIS TO LOOK AT THE OBJECT WE JUST MADE
// var_dump($obj);

// ITERATE OVER THE OBJECT TO EXTRACT ARTICLES
foreach ($obj as $article)
{
    // RENDER EACH ARTICLE INTO XML
    $str = $article->asXML();
    echo PHP_EOL;
    echo htmlentities($str);
    echo PHP_EOL;
}
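
Building on the loop above, here is a minimal sketch of collecting each article's XML keyed by its code attribute, ready for the "code || xml" table described earlier in the thread. The $rows array and the whitespace-collapsing preg_replace() are illustrative assumptions, not part of the original answer. Note that asXML() writes attribute values with double quotes, so the stored string may differ cosmetically from the input.

```php
<?php
$xml = <<<XML
<articles>
<article code='1' foo='Bar'>
 <title>Some Title</title>
 <body>Some Text</body>
</article>
<article code='2'>
 <title>Another Title</title>
 <body>Some More Text</body>
</article>
</articles>
XML;

$obj = simplexml_load_string($xml);

$rows = array();
foreach ($obj->article as $article) {
    // Attribute access returns an object, so cast it to a string.
    $code = (string) $article['code'];
    // Assumption: whitespace between tags is insignificant and can be dropped.
    $rows[$code] = preg_replace('/>\s+</', '><', $article->asXML());
}

print_r($rows);
```

If whitespace between tags matters in your data, skip the preg_replace() step and store the asXML() result unchanged.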

 
LVL 111

Expert Comment

by:Ray Paseur
ID: 35009733
Thanks for the points - it's a really good question, ~Ray