PHP - Strange results when I try to retrieve some websites: get_meta_tags & cURL

Here's my code:
$URL[1] = ''; 
$URL[2] = '';

foreach ($URL as $Key => $Website) {
    // Read the <meta> tags straight from the remote URL
    $Meta = get_meta_tags($Website);
    foreach ($Meta as $Key1 => $Value1) {
        echo $Website . ' ' . $Key1 . ' = ' . $Value1 . '<br>';
    }

    // Fetch the full page with cURL and dump it
    $Page = GetWebPage($Website);
    echo $Website . "<br><br><br>Webpage <br><br><br><br>" . $Page;
}

function GetWebPage($URL)
{
    $ua = 'Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0 (ROBOT)';

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL,            $URL);
    curl_setopt($ch, CURLOPT_USERAGENT,      $ua);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, True);
    curl_setopt($ch, CURLOPT_NOBODY,         False);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, True);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, True);

    // Returns the response body as a string, or false on failure
    $content = curl_exec($ch);
    return $content;
}


When I run this code, it works fine for one of the domains, but for the other, the get_meta_tags function returns nothing and the GetWebPage function returns a load of gibberish!

Can anyone identify why this is happening?
gr8gonzo (Consultant) commented:
Sure. This is because the site is returning a gzip-compressed webpage, not the raw HTML. My suggestion would be to add this curl option to GetWebPage:

curl_setopt($ch, CURLOPT_ENCODING, "none");

That way, the site will return the standard, uncompressed HTML (since it thinks cURL can't handle gzip) and you'll be able to see it normally.
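An alternative worth considering, sketched here under my own assumptions rather than taken from the answer above: the PHP manual lists "identity", "deflate" and "gzip" as the supported CURLOPT_ENCODING values, and says that an empty string sends a header with every encoding the cURL build supports while also enabling decoding of the response. The helper name fetchPageDecoded below is hypothetical and only for illustration.

```php
<?php
// Hypothetical helper (name not from the original post): fetch a page and
// let cURL decompress the response body transparently.
function fetchPageDecoded(string $url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL,            $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    // Empty string: advertise all encodings this cURL build supports
    // and decode the response before handing it back.
    curl_setopt($ch, CURLOPT_ENCODING, '');

    // String body on success, false on failure
    return curl_exec($ch);
}
```

Calling fetchPageDecoded($Website) in place of GetWebPage($Website) should then hand back readable HTML even when the server compresses its output, assuming compression really is the cause of the gibberish.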

Then, save the results of GetWebPage to a local, temporary file and then use get_meta_tags on that local file, and unlink() it afterwards. You'll get the right results AND you'll save an extra web call.
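The temporary-file step could look roughly like the sketch below. The function name metaTagsFromHtml and the sample HTML are made up for illustration; in practice the string would be whatever GetWebPage returned.

```php
<?php
// Hypothetical helper (name not from the original post): run get_meta_tags()
// on HTML that has already been downloaded, via a temporary local file.
function metaTagsFromHtml(string $html): array
{
    $tmp = tempnam(sys_get_temp_dir(), 'meta');
    if ($tmp === false) {
        return [];
    }

    file_put_contents($tmp, $html);
    $meta = get_meta_tags($tmp);   // parses the local copy, no second web call
    unlink($tmp);                  // clean up the temporary file

    return $meta === false ? [] : $meta;
}

// Made-up sample page standing in for the output of GetWebPage()
$html = '<html><head><meta name="description" content="Example page"></head><body></body></html>';
print_r(metaTagsFromHtml($html));
```

Because get_meta_tags() only accepts a filename or URL, writing the already-fetched HTML to disk is what avoids the second request.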
Ray Paseur commented:
Check these pages with "view source." It looks like the SEO link directory is generated using JavaScript libraries. This is a technique that some publishers have adopted to prevent "screen scraping" with cURL, as a way of protecting their copyrighted content. If you want access to their content and they want you to have programmatic access (this is probably a paid relationship), the publisher will usually expose an API. You might ask about that.
Question has a verified solution.