cbielich
asked on
Using DOMNodeList to get links with rel=nofollow using php
I have this code, but it does not seem to be working, or I am not figuring out how to get the data I need into a PHP variable.
<?
$html = 'http://www.website.com';
$doc = new DOMDocument();
libxml_use_internal_errors (true);
$doc->loadHTML($html); // loads your html
$xpath = new DOMXPath($doc);
// returns a list of all links with rel=nofollow
$nlist = $xpath->query('//a[@rel="nofollow"]');
foreach ($nlist as $link) {
//condition here
}
?>
The foreach loop does not seem to be finding the links. Is there a way I can do this? I need to extract the URL from each <a> that contains rel=nofollow.
Also, the "nofollow" is wrapped in double quotes. What if a page has it wrapped in single quotes? Will the query still find those? I need to make sure I find both.
$html = 'http://www.website.com';
I am guessing that is not the real URL you want to inspect. Please post the ACTUAL URL and I will see if I can show you how to find the information you want.
If it even works on your system, the least you could do is switch it off again after use. I would guess that using DOMDocument is not really secure, and that there are better ways to load the content of another website (such as cURL, though I have never used it).
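For what it is worth, here is a minimal sketch of how the extraction might look. It is not the accepted solution, only an illustration under some assumptions: the function name nofollowLinks, the sample markup, and the example hosts are all made up. Note that DOMDocument::loadHTML() expects a string of markup, whereas the code in the question passes it a URL; for a live page you would fetch the HTML first or use DOMDocument::loadHTMLFile(). The quoting style should not matter, because the XPath query runs against the parsed DOM rather than the raw text.

```php
<?php
// Hypothetical helper (not from the thread): returns the href of every
// <a> whose rel attribute contains the token "nofollow".
function nofollowLinks(string $html): array
{
    // Collect parser warnings internally, remembering the previous setting.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html); // expects markup, not a URL

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // switch it back after use

    $xpath = new DOMXPath($doc);
    // Token match, so rel="external nofollow" is found as well.
    $query = '//a[contains(concat(" ", normalize-space(@rel), " "), " nofollow ")]';

    $urls = [];
    foreach ($xpath->query($query) as $link) {
        $urls[] = $link->getAttribute('href');
    }
    return $urls;
}

// Sample markup (an assumption) mixing double and single quotes.
$sample = '<html><body>'
    . '<a href="http://a.example/" rel="nofollow">one</a>'
    . "<a href='http://b.example/' rel='nofollow'>two</a>"
    . '<a href="http://c.example/" rel="external nofollow">three</a>'
    . '<a href="http://d.example/">four</a>'
    . '</body></html>';

print_r(nofollowLinks($sample));
```

The token-style match is a design choice: a plain @rel="nofollow" comparison would miss links whose rel carries several values. The match is still case-sensitive, so a page using rel="NoFollow" would need extra handling.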