PHP XML XPATH problem

Hello,
I have the following XML string that gets generated via a function call.

I have to take this XML and obtain all the email addresses from all the nodes that are called "apps:property" and have the attribute "name='email'".

I tried doing it with SimpleXML, but it seems that it can't handle nodes with ":" in their names.
I also tried various other approaches (included below) with no luck.


<?php
$xml =<<<EOT
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:apps='http://schemas.google.com/apps/2006'>
	<id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner</id>
	<updated>2009-04-15T21:36:22.207Z</updated>
	<link rel='http://schemas.google.com/g/2005#feed' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
	<link rel='http://schemas.google.com/g/2005#post' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
	<link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
	<openSearch:startIndex>1</openSearch:startIndex>
	<entry>
		<id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com</id>
		<updated>2009-04-15T21:36:22.207Z</updated>
		<link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com'/>
		<link rel='edit' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com'/>
		<apps:property name='email' value='admin@mydomain.com'/>
		<apps:property name='type' value='User'/>
	</entry>
	<entry>
		<id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com</id>
		<updated>2009-04-15T21:36:22.207Z</updated>
		<link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com'/>
		<link rel='edit' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com'/>
		<apps:property name='email' value='user2@mydomain.com'/>
		<apps:property name='type' value='User'/>
	</entry>	
</feed>
 
EOT;
 
$doc = new DOMDocument;
$doc->preserveWhiteSpace = FALSE;
 
$doc->LoadXML($xml);
//print $doc;
 
 
$params = $doc->getElementsByTagName('apps'); // Find Sections 
sizeof($params);
foreach ($params as $param) //go to each section 1 by 1
{
	print "looping params<br>";
	print $param->getAttribute('name') ;
	print $param->getAttribute('value') ;
	print "-------<br>\n";
}
	
$xpath = new DOMXpath ($doc);
 
$arts = $xpath->query("//entry/apps:property[@name='email']"); 
 
//
// At this point $arts should contain the nodes, but it does not.
//
 
foreach ($arts as $art) 
{
    echo "Nodevalue" . $art->nodeValue."<br />";
}
 
foreach ($arts as $art) {
    echo "looping<hr>";
 
 
    for ($nodo = $art->firstChild; $nodo !== NULL; $nodo = $nodo->nextSibling) 
    {
		print "nodo->nodeName  = " . $nodo->nodeName  ."<br>";
		print "nodo->hasAttributes    = " . (string)$nodo->hasAttributes()   ."<br>";
		
 
    }
 
    foreach ($art->attributes as $atributo) {
    		print "atributo->nodeName = " . $atributo->nodeName . "<br>\n";
    }
 
} 
 
//
// XMLReader is the only thing thus far that can fully parse the XML but I do not want
// to navigate the XML, I want to search it.
//
 
$xmlr = new XMLReader();
$xmlr->xml($xml); 
 
    $assoc = xml2assoc($xmlr);
    $xmlr->close();
    print_r($assoc); 
 
function xml2assoc($xmlr) {
    $tree = null;
    while($xmlr->read())
        switch ($xmlr->nodeType) {
            case XMLReader::END_ELEMENT: return $tree;
            case XMLReader::ELEMENT:
                $node = array('tag' => $xmlr->name, 'value' => $xmlr->isEmptyElement ? '' : xml2assoc($xmlr));
                if($xmlr->hasAttributes)
                    while($xmlr->moveToNextAttribute())
                        $node['attributes'][$xmlr->name] = $xmlr->value;
                $tree[] = $node;
            break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                $tree .= $xmlr->value;
        }
    return $tree;
}
 
?>

Asked by: sinner052397
 
abel Commented:
If you are stuck with an XPath processor that is not namespace-aware (sorry, I have no hands-on experience with this one), you can still find the namespaced nodes with the following XPath trick. Note that it will also match elements that have the same local name but belong to a different namespace, so it only works cleanly when no such elements exist:

XPath:
//*[local-name() = 'property'][@name = 'email']

This will return all elements that have the local name "property" (like apps:property) and an attribute "name" with the value "email".

-- Abel --
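
A minimal sketch of how this XPath might be run from PHP with DOMXPath, assuming a trimmed-down copy of the feed from the question (the shortened XML and the `$emails` variable are illustrative, not from the original post). The address sits in the `value` attribute, so it is read with `getAttribute()` rather than `nodeValue`:

```php
<?php
// Trimmed-down stand-in for the feed in the question (assumption: same structure).
$xml = <<<EOT
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:apps='http://schemas.google.com/apps/2006'>
  <entry>
    <apps:property name='email' value='admin@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
  <entry>
    <apps:property name='email' value='user2@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
</feed>
EOT;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);

// Match on the local name only, so no namespace prefix has to be registered.
$nodes = $xpath->query("//*[local-name() = 'property'][@name = 'email']");

$emails = [];
foreach ($nodes as $node) {
    // The email address is stored in the "value" attribute, not in the node text.
    $emails[] = $node->getAttribute('value');
}

print_r($emails);
```

Because the match ignores the namespace entirely, this would also pick up a `property` element from any other namespace if the feed ever contained one.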
 
Ray Paseur Commented:
"I tried doing it with simpleXML but it seems that it can't handle nodes with : in the names."

Right, but there are ways to get around that and use SimpleXML!  Here is a sample that shows how.  Run it and see what you get.

HTH, ~Ray
<?php // RAY_temp_sinner.php
error_reporting(E_ALL);
 
// FROM THE OP
// I have to take this XML and obtain all the email addresses from all the nodes that are called "apps:property" and have the attribute "name='email'"
 
$xml =<<<EOT
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:apps='http://schemas.google.com/apps/2006'>
        <id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner</id>
        <updated>2009-04-15T21:36:22.207Z</updated>
        <link rel='http://schemas.google.com/g/2005#feed' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
        <link rel='http://schemas.google.com/g/2005#post' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
        <link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner'/>
        <openSearch:startIndex>1</openSearch:startIndex>
        <entry>
                <id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com</id>
                <updated>2009-04-15T21:36:22.207Z</updated>
                <link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com'/>
                <link rel='edit' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/admin%40mydomain.com'/>
                <apps:property name='email' value='admin@mydomain.com'/>
                <apps:property name='type' value='User'/>
        </entry>
        <entry>
                <id>https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com</id>
                <updated>2009-04-15T21:36:22.207Z</updated>
                <link rel='self' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com'/>
                <link rel='edit' type='application/atom xml' href='https://apps-apis.google.com/a/feeds/group/2.0/mydomain.com/testgroup/owner/user2%40mydomain.com'/>
                <apps:property name='email' value='user2@mydomain.com'/>
                <apps:property name='type' value='User'/>
        </entry>
</feed>
 
EOT;
// END OF DATA FROM THE OP
 
// MUNG THE XML TO MAKE IT USEFUL WITH SIMPLEXML
$munged_xml = str_replace('apps:', 'apps_', $xml);
 
// MAKE AN OBJECT
$obj = SimpleXML_Load_String($munged_xml);
 
// ITERATE OVER THE OBJECT
foreach ($obj->entry as $my_entry)
{
 
// IS THIS THE THING WE WANT?
   if ($my_entry->apps_property["name"] == "email")
   {
 
// YES - SHOW IT
      $my_email = $my_entry->apps_property["value"];
      echo "<br/>$my_email \n";
   }
}

 
abel Commented:
> // MUNG THE XML TO MAKE IT USEFUL WITH SIMPLEXML

Here you actually change the XML, which I would not advise. If you still need the same XML, you have to keep a reference to both the original and the amended XML. Still, it is a way around the problem with SimpleXML.

If you want to keep the XML normal (i.e., uncrippled) and use normal, valid XPaths, see my earlier comments.
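
For comparison, here is a sketch of how SimpleXML might read the namespaced elements without altering the XML string, by passing the namespace URI to `children()`. It assumes a trimmed-down copy of the feed from the question; the variable names are illustrative only.

```php
<?php
// Trimmed-down stand-in for the feed in the question (assumption: same structure).
$xml = <<<EOT
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:apps='http://schemas.google.com/apps/2006'>
  <entry>
    <apps:property name='email' value='admin@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
  <entry>
    <apps:property name='email' value='user2@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
</feed>
EOT;

$feed = simplexml_load_string($xml);

// Namespace URI taken from the xmlns:apps declaration on <feed>.
$appsNs = 'http://schemas.google.com/apps/2006';

$emails = [];
foreach ($feed->entry as $entry) {
    // children($ns) returns only the child elements in that namespace.
    foreach ($entry->children($appsNs) as $property) {
        // The attributes carry no prefix, so they are fetched with no namespace.
        $attrs = $property->attributes();
        if ((string) $attrs->name === 'email') {
            $emails[] = (string) $attrs->value;
        }
    }
}

print_r($emails);
```

Using the namespace URI rather than the `apps` prefix keeps the code working even if the feed later declares the same namespace under a different prefix.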
 
Ray Paseur Commented:
@abel: If you install and run the code I posted above, you will see that it works.  You will also see that I do NOT, in fact, change the original XML; I copy it to a new variable in the process of changing the values.  That is clear from line 37 of the code (the str_replace() call).

Whether to mung the original or to copy it and mung the copy is a decision that depends largely on the size of the XML file.  If the file is so large that it will cause storage issues, it might make sense to mung the original, secure in the knowledge that you can always reload a clean copy.

best regards, ~Ray
 
sinner052397 (Author) Commented:
Abel: Your XPath query was correct, but not complete in the sense that there was no PHP code included. I had some trouble with PHP when navigating the results (getAttribute vs. get_attribute).

Ray: Worked like a charm. As I stated in the initial requirement, all I need to do is extract the emails from the XML, so there is no problem with modifying the XML string.
 
Ray Paseur Commented:
Thanks for the points - it's a great question and one that gets asked a lot!  Best, ~Ray
 
abel Commented:
> Your xpath query was correct, but not complete in the sense that there was no PHP code included.

np, I only focused on the XPath part, as PHP is not my current native language ;)

> You will also see that I do NOT in fact, change the original XML

apologies, my mistake, I started barking before I read the whole code... Still it wouldn't be my preferred method, but then again, it wouldn't be my preferred method to use a non-compliant XPath processor with compliant XML, it only brings more and more work for workarounds ... ;-)
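
As a possible namespace-aware alternative to the workarounds in this thread, the sketch below registers prefixes on DOMXPath before querying. In XPath 1.0 an unprefixed name such as `entry` only matches elements in no namespace, while the feed's `entry` elements live in the default Atom namespace, which is one likely reason the query in the question returned nothing. The `atom` prefix is an arbitrary choice made here, and the XML is a trimmed-down stand-in for the original feed.

```php
<?php
// Trimmed-down stand-in for the feed in the question (assumption: same structure).
$xml = <<<EOT
<feed xmlns='http://www.w3.org/2005/Atom' xmlns:apps='http://schemas.google.com/apps/2006'>
  <entry>
    <apps:property name='email' value='admin@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
  <entry>
    <apps:property name='email' value='user2@mydomain.com'/>
    <apps:property name='type' value='User'/>
  </entry>
</feed>
EOT;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);

// Bind prefixes to the namespace URIs declared on <feed>.
// "atom" is a prefix chosen for this sketch; only the URI has to match.
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('apps', 'http://schemas.google.com/apps/2006');

// Every element step now carries a prefix, including the Atom <entry>.
$nodes = $xpath->query("//atom:entry/apps:property[@name='email']");

$emails = [];
foreach ($nodes as $node) {
    $emails[] = $node->getAttribute('value');
}

print_r($emails);
```

Unlike the `local-name()` trick, this query only matches `property` elements in the Google Apps namespace, at the cost of two extra `registerNamespace()` calls.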