ChilliSauce asked:

PHP: strange results when retrieving some websites with get_meta_tags and cURL

Here's my code:
<?php
$URL[1] = 'http://seolinkdirectory.org';
$URL[2] = 'http://moz.com';

foreach ($URL as $Key => $Website)
{
    // get_meta_tags() fetches the URL itself; it does not use the cURL settings below.
    $Meta = get_meta_tags($Website);
    foreach ($Meta as $Key1 => $Value1)
    {
        echo $Website . ' ' . $Key1 . ' = ' . $Value1 . '<br>';
    }

    // Fetch the same page with cURL and print the raw response.
    $Page = GetWebPage($Website);
    echo $Website . "<br><br><br>Webpage <br><br><br><br>" . $Page;
}

// Returns the response body for $URL as a string, or false on failure.
function GetWebPage($URL)
{
    $ua = 'Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0 (ROBOT)';

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL,            $URL);
    curl_setopt($ch, CURLOPT_USERAGENT,      $ua);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_NOBODY,         false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}
?>

When I run this code, it works fine for http://moz.com, but for http://seolinkdirectory.org the get_meta_tags function returns nothing and the GetWebPage function returns a load of gibberish.

Can anyone identify why this is happening?

Ray Paseur replied:

Check these pages with "view source." It looks like the SEO link directory page is generated using JavaScript libraries. Some publishers have adopted this technique to prevent "screen scraping" with cURL, as a way to protect their copyrighted content. If the publisher wants you to have programmatic access to their content (this is probably a paid relationship), they will usually expose an API. You might ask them about that.
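
One further cause worth ruling out, offered as an assumption rather than a diagnosis of this particular site: "gibberish" from cURL is often a gzip-compressed response body that was never decoded. The sketch below checks for the gzip magic bytes, asks cURL to decode compressed responses by setting CURLOPT_ENCODING to an empty string, and reads the meta tags from the already-fetched HTML instead of letting get_meta_tags fetch the page a second time. The function names looksGzipped, decodeIfGzipped, fetchDecoded and extractMetaTags are hypothetical helpers, not part of the original code, and the sketch assumes the curl, zlib and dom extensions are enabled.

```php
<?php
// Hypothetical helper: true if $body starts with the gzip magic bytes 0x1f 0x8b.
function looksGzipped($body)
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

// Hypothetical helper: gunzip $body if it looks gzip-compressed, else return it unchanged.
function decodeIfGzipped($body)
{
    if (!looksGzipped($body)) {
        return $body;
    }
    $decoded = @gzdecode($body);
    return $decoded === false ? $body : $decoded;
}

// Hypothetical replacement for GetWebPage(): returns the decoded body, or false on failure.
function fetchDecoded($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // An empty string advertises every encoding this cURL build supports
    // and makes cURL decode the response before returning it.
    curl_setopt($ch, CURLOPT_ENCODING, '');
    $content = curl_exec($ch);
    if ($content === false) {
        return false;
    }
    // Fallback for servers that compress without being asked.
    return decodeIfGzipped($content);
}

// Hypothetical helper: collect <meta name="..." content="..."> pairs from an HTML string.
function extractMetaTags($html)
{
    if ($html === '') {
        return [];
    }
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tags = [];
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if ($name !== '') {
            $tags[$name] = $meta->getAttribute('content');
        }
    }
    return $tags;
}
```

A call such as extractMetaTags(fetchDecoded($Website)) would then stand in for the separate get_meta_tags and GetWebPage calls, after checking that fetchDecoded did not return false. If the decoded page still contains no useful markup, the JavaScript explanation above remains the more likely one.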

Asker-certified solution by gr8gonzo: not included here; it is available only to Experts Exchange members.