PHP DOM getElementById

I'm trying to use PHP DOM's getElementById, and I'm just following the basic tutorial, but it doesn't work.  All I get as output is:
The element whose id is books is:

My book.xml code:

 <books>
  TEST BOOK
  </books>


and my php code:


<?php
 
$doc = new DomDocument;
 
// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->Load('book.xml');
 
echo "The element whose id is books is: " . $doc->getElementById('books')->tagName . "\n";
 
?>

walker6o9 Asked:

Ray Paseur Commented:
Just curious - have you tried using SimpleXML instead?

walker6o9 (Author) Commented:
Yes, I can do this with SimpleXML, but for what I'm ultimately going to try to do getElementById with PHP DOM will work a lot better.

walker6o9 (Author) Commented:
I'm trying to do the tutorial here:

http://theserverpages.com/php/manual/en/function.dom-domdocument-getelementbyid.php

but it doesn't say what the code for book.xml should be, and it seems to me like I'm writing it wrong.

Ray Paseur Commented:
Here is how I would go about this...
<?php // RAY_simplexml_9.php
error_reporting(E_ALL);
echo "<pre>";
 
// TEST DATA FROM THE OP
$xml = '
  <books>
  TEST BOOK
  </books>';
 
// MAKE AN OBJECT
$obj = SimpleXML_Load_String($xml);
 
// VISUALIZE THE DATA IN OBJECT FORM
var_dump($obj);
 
// PROVIDE A WRAPPER SO WE CAN DO THIS THE RIGHT WAY
$valid_xml = '<rezults>' . $xml . '</rezults>';
$valid_obj = SimpleXML_Load_String($valid_xml);
 
// SHOW THE DATA VALUE
echo (string)$valid_obj->books;

walker6o9 (Author) Commented:
I appreciate your response, but I already know how to use SimpleXML for that particular purpose.  What I'm trying to do though is be able to grab the information in both HTML and XML files with php based on their element ID tag.

Roger Baklund Commented:
First, you must have id attributes in the xml:

<?xml version="1.0" encoding="iso-8859-1"?>
<books>
  <book id="books">TEST BOOK</book>
  <book id="books2">TEST BOOK2</book>
</books>

Second, you need to identify the id attributes. It can be done like this:
<?php
$doc = new DOMDocument("1.0");

// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->load('book.xml');

// book.xml has no DTD, so mark the "id" attribute of each book as an ID by hand
foreach ($doc->getElementsByTagName('book') as $book) {
    $book->setIdAttribute('id', true);
}

echo "The element whose id is books is: " . $doc->getElementById('books')->tagName . "\n";
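
To see the effect of setIdAttribute() in isolation, here is a minimal sketch. It assumes the same XML as above, inlined as a string so that no book.xml file is needed, and it compares the lookup before and after the ID attributes are registered.

```php
<?php
// Same data as book.xml above, inlined so the sketch needs no external file.
$xml = '<?xml version="1.0" encoding="iso-8859-1"?>
<books>
  <book id="books">TEST BOOK</book>
  <book id="books2">TEST BOOK2</book>
</books>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Without a DTD, "id" is just an ordinary attribute, so the lookup returns null.
$before = $doc->getElementById('books');
var_dump($before === null);

// Register "id" as an ID attribute on every <book> element.
foreach ($doc->getElementsByTagName('book') as $book) {
    $book->setIdAttribute('id', true);
}

// Now the same lookup finds the element.
$after = $doc->getElementById('books');
echo $after->tagName, "\n";
```

The null result before the loop is also why the original script printed nothing after "is:".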

walker6o9 (Author) Commented:
If I wanted to change book.xml to book.html, what would the html file look like?  Or would I need to change the php code?

walker6o9 (Author) Commented:
CXR- Also, the above example returns
The element whose id is books is: book.

Is there a way to have it return TEST BOOK or TEST BOOK2?

P.S. Thank you.

Ray Paseur Commented:
"what the code for book.xml should be" - Yeah, I've been wondering about that myself!

Ray Paseur Commented:
Maybe this will be helpful.

http://us3.php.net/manual/en/class.domdocument.php#91072

Also, you might want getElementsByTagName instead of looking for the ID.

Roger Baklund Commented:
>> Is there a way to have it return TEST BOOK or TEST BOOK2

Use the "nodeValue" property:

http://php.net/manual/en/class.domnode.php#domnode.props.nodevalue
echo $doc->getElementById('books')->nodeValue;
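
A small sketch of the difference between tagName and nodeValue, using an inline document instead of book.xml. The nested em element is an assumption added for illustration: for an element, nodeValue is the concatenated text of all its descendants, so inner tags are dropped.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<books><book id="books">TEST <em>BOOK</em></book></books>');

// No DTD here, so register the ID attributes by hand.
foreach ($doc->getElementsByTagName('book') as $node) {
    $node->setIdAttribute('id', true);
}

$book = $doc->getElementById('books');

$tag  = $book->tagName;    // the element's name
$text = $book->nodeValue;  // text of the element and all of its descendants

echo $tag, "\n";
echo $text, "\n";
```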

walker6o9 (Author) Commented:
That's working really well.  I tried using an .html file, instead of an .xml, and it works fine (I'm swapping xml text in for html text).

The issue I'm having, though, is that it doesn't seem to work if the content contains any inner tags (<>), such as links or inner divs.  Is there a way to make it work with <>?

walker6o9 (Author) Commented:
CXR -

This line appears to be causing some problems:
foreach($doc->getElementsByTagName('book') as $book)
  $book->setIdAttribute('id',true);


But I can't seem to get it to work without that line.  However, the example provided by the php manual does not have it?
<?php
 
$doc = new DomDocument;
 
// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->Load('book.xml');
 
echo "The element whose id is books is: " . $doc->getElementById('books')->tagName . "\n";
 
?> 

Roger Baklund Commented:
The example in the manual does not work for me either. This excerpt from the manual gives a hint, though:

" For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute  or a DTD which
  defines an attribute to be of type ID. In the later case, you will need to validate your document with
  DOMDocument::validate or DOMDocument->validateOnParse before using this function. "

http://php.net/manual/en/domdocument.getelementbyid.php

The book.xml used in the manual probably has a DTD, that is why validateOnParse is set to true in the example. Without a DTD, you need to use setIdAttribute() if you are going to use getElementById().
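
For comparison, here is a sketch of the DTD route. The internal DTD below is an assumption (the manual does not show its book.xml); it declares the id attribute of book to be of type ID, so no setIdAttribute() call is needed.

```php
<?php
// Hypothetical book.xml content with an internal DTD that types "id" as ID.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE books [
  <!ELEMENT books (book+)>
  <!ELEMENT book (#PCDATA)>
  <!ATTLIST book id ID #REQUIRED>
]>
<books>
  <book id="books">TEST BOOK</book>
  <book id="books2">TEST BOOK2</book>
</books>
XML;

$doc = new DOMDocument();
$doc->validateOnParse = true; // validate against the DTD while loading
$doc->loadXML($xml);

// The DTD already marks "id" as an ID, so the lookup works directly.
$found = $doc->getElementById('books2');
echo $found->nodeValue, "\n";
```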

walker6o9 (Author) Commented:
So, basically, this won't work for parsing html then?

Roger Baklund Commented:
Depends on the html... it should work for valid xhtml (which also is xml), but will probably not work well with malformed "quirks mode" html, which is what you most often will find on the net.

walker6o9 (Author) Commented:
Can you show me an example of this grabbing an element by id from an XHTML file?

Roger Baklund Commented:
The challenge is to find a valid xhtml page... ;)

Try this:
$doc = new DomDocument;
$doc->resolveExternals = true;
$doc->Load('http://www.w3.org/');
echo "The element whose id is slogan is: " . $doc->getElementById('slogan')->tagName . "<br />\n";
var_dump($doc->getElementById('slogan')->nodeValue);

walker6o9 (Author) Commented:
That worked fine.  Then I tried to copy and paste the source code from www.w3.org into a file called "test.html", and put it in the same folder as my php code, and I got:
The element whose id is slogan is:
NULL

<?php
 
$doc = new DomDocument;
$doc->resolveExternals = true;
$doc->Load('test.html');
echo "The element whose id is slogan is: " . $doc->getElementById('slogan')->tagName . "<br />\n";
var_dump($doc->getElementById('slogan')->nodeValue);
 
?>

walker6o9 (Author) Commented:
Ignore my last comment, I made a mistake.

This seems to work really well, but it only returns text.  For example, looking up the id logo returns a blank.  Is there a way to get it to return this instead?

<img alt="The World Wide Web Consortium (W3C)" height="48" width="315" src="/Icons/w3c_main" />


walker6o9 (Author) Commented:
Or include links, for example, rather than just straight text.

Roger Baklund Commented:
The element with id=logo is an h1 element, and it contains an img element.

To get the child element of the h1 element, you can use the firstChild property. To extract an element as a string, you can use the saveXML() method of the DOMDocument object. To see the string in a html context, you must use htmlentities().

Try this:
$doc = new DomDocument;
$doc->resolveExternals = true;
$doc->Load('http://www.w3.org/');
echo "The element whose id is logo is: " . $doc->getElementById('logo')->tagName . "<br />\n";
$logo = $doc->getElementById('logo');
$img = $doc->saveXML($logo->firstChild);
echo htmlentities($img);
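
One thing to watch for with firstChild: if there is whitespace or a line break between the h1 tag and the img tag, firstChild is that whitespace text node rather than the img element. A minimal sketch, assuming a simplified inline stand-in for the w3.org markup, that picks the first child that is actually an element:

```php
<?php
// Simplified stand-in for the page markup; note the line break before <img>.
$xml = '<h1 id="logo">
  <img alt="W3C" src="/Icons/w3c_main"/>
</h1>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$logo = $doc->documentElement;

// firstChild here is the whitespace text node, not the image.
$firstIsText = ($logo->firstChild->nodeType === XML_TEXT_NODE);

// Walk the children until an element node turns up.
$img = null;
foreach ($logo->childNodes as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $img = $child;
        break;
    }
}

$markup = $doc->saveXML($img);
echo htmlentities($markup), "\n";
```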

Roger Baklund Commented:
>> Or include links, for example, rather than just straight text.

What do you mean?

walker6o9 (Author) Commented:
If you had an h1 tag that looked like this:

<h1 id="logo">This is a <a href="link.html">link</a> test</h1>

And you wanted to grab

This is a <a href="link.html">link</a> test


Roger Baklund Commented:
That h1 element contains multiple child nodes: a text node with value "This is a ", an anchor element containing another text node with value "link", and finally a text node with value " test".

If you wanted to include the h1 tags as well as the content in your output, it would be easier: you could just do

$s = $doc->saveXML($logo);

To get just the content of the h1 element, you must loop and output each of the child nodes:
$str = '<h1 id="logo">This is a <a href="link.html">link</a> test</h1>';
$doc = new DomDocument;
$doc->LoadXML($str);
foreach($doc->getElementsByTagName('h1') as $h1)
  $h1->setIdAttribute('id',true);
echo "The element whose id is logo is: " . $doc->getElementById('logo')->tagName . "<br />\n";
$logo = $doc->getElementById('logo');
foreach($logo->childNodes as $o) {
  $s = $doc->saveXML($o);
  echo htmlentities($s);
}
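
The same loop can be wrapped in a small function if you need the inner markup in more than one place. This is only a sketch: innerXml() is a hypothetical helper name, not something the DOM extension provides.

```php
<?php
/**
 * Hypothetical helper (not part of the DOM extension): serializes the
 * children of $element, leaving out the element's own tags.
 */
function innerXml(DOMElement $element): string
{
    $out = '';
    foreach ($element->childNodes as $child) {
        // saveXML() with a node argument serializes just that node.
        $out .= $element->ownerDocument->saveXML($child);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadXML('<h1 id="logo">This is a <a href="link.html">link</a> test</h1>');

$inner = innerXml($doc->documentElement);
echo htmlentities($inner), "\n";
```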

walker6o9 (Author) Commented:
That worked well.  I tried to incorporate it into a search and replace, where the header is replaced by text from an .xml file.  This works really well, EXCEPT if there is a link.

So this

<h1 id="logo">This is a link test</h1>

Will get changed, but this won't

<h1 id="logo">This is a <a href="link.html">link</a> test</h1>

Even though they both output with the echo statement fine.
<?php
 
$docx = new DomDocument("1.0");
 
// We need to validate our document before referring to the id
$docx->validateOnParse = true;
$docx->Load('book.xml');
 
// define id attributes
foreach($docx->getElementsByTagName('div') as $bookx)
  $bookx->setIdAttribute('id',true);
 
$desired_content = $docx->getElementById('books')->nodeValue;
 
$outputfile = 'output.html';
$url = 'test.html';
$content = file_get_contents($url);
$between = '';
 
$doc = new DomDocument;
$doc->resolveExternals = true;
$doc->Load('test.html');
$logo = $doc->getElementById('logo');
foreach($logo->childNodes as $o) {
  $s = $doc->saveXML($o);
  $between = $between.$s; 
}
$output = str_replace($between, $desired_content, $content);
echo $between;
$fh = fopen($outputfile, 'w') or die("Can't open the output file");
fwrite($fh, $output);
fclose($fh);
?>

walker6o9 (Author) Commented:
Typo on line 28.  The code should have read like this:
<?php
$docx = new DomDocument("1.0");
 
// We need to validate our document before referring to the id
$docx->validateOnParse = true;
$docx->Load('book.xml');
 
// define id attributes
foreach($docx->getElementsByTagName('div') as $bookx)
  $bookx->setIdAttribute('id',true);
 
 $desired_content = $docx->getElementById('books')->nodeValue;
 
$outputfile = 'output.html';
$url = 'test.html';
$content = file_get_contents($url);
$between = '';
 
$doc = new DomDocument;
$doc->resolveExternals = true;
$doc->Load('test.html');
$logo = $doc->getElementById('logo');
foreach($logo->childNodes as $o) {
  $s = $doc->saveXML($o);
  $between = $between.$s;
}
$output = str_replace($desired_content, $between, $content);
echo $between;
$fh = fopen($outputfile, 'w') or die("Can't open the output file");
fwrite($fh, $output);
fclose($fh);
?>

Roger Baklund Commented:
It seems to me the first version was correct... the syntax for str_replace is:

str_replace ( mixed $search , mixed $replace , mixed $subject [, int &$count ] )

The string to search for comes first, and the string to insert comes second.
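
A short sketch of that argument order. The strings are made up for illustration, and the optional fourth argument reports how many replacements were made, which is a quick way to check whether the search string was found at all.

```php
<?php
$content = '<h1 id="logo">This is a link test</h1>';

$search  = 'This is a link test';  // text currently in the page
$replace = 'TEST BOOK';            // text that should take its place

// Argument order: search, replace, subject; $count receives the number of replacements.
$output = str_replace($search, $replace, $content, $count);

echo $output, "\n";
echo $count, "\n";
```

If $count comes back as 0, the search string does not occur byte for byte in the subject.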

If you can't figure it out, please open a new question. This question was about using getElementById(). Thank you! :)

Roger Baklund Commented:
Btw, I said earlier that malformed html would not work well, but I forgot about this method:

http://php.net/manual/en/domdocument.loadhtmlfile.php
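
A minimal sketch of that approach, using loadHTML() on an inline string so it runs without a file (loadHTMLFile() is the file-based counterpart). The sloppy markup is an assumption made up for the example. With the HTML parser, id attributes can be looked up without calling setIdAttribute().

```php
<?php
// Deliberately sloppy HTML: no doctype, unclosed <p>, <br> without a slash.
$html = '<html><body><div id="logo"><p>Hello<br>World</div></body></html>';

// Collect libxml's parse warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// The lookup works directly on the parsed HTML document.
$div = $doc->getElementById('logo');
echo $div->tagName, "\n";
echo $div->nodeValue, "\n";
```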

walker6o9 (Author) Commented:
Sorry, didn't realize we'd drifted into a new topic.  Thanks for your help, I appreciate it.