Solved

Mod Rewrite - Pages load fine, but facebook and google think 404

Posted on 2011-10-01
Last Modified: 2012-05-12
I need someone with experience using mod_rewrite.

I have a connect script running in the subdirectory /health/ (the script is shown below).

The connect script is in index.php. It populates pages with data from a remote DB, with URLs that look like this:

/health/index.php?resource=/assets/heart

I would like them to appear like this:

/health/assets/heart

When I changed the request handler from:

"request_handler_uri=" . urlencode("/index.php?resource="),

To

"request_handler_uri=" . urlencode("/health"),

The URL structure appears correctly, yet I get a 404.

So I changed the htaccess to:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /health/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule index.php(.*)$ index.php?resource=/$1 [QSA]
RewriteRule ^(.*)$ index.php?resource=/$1 [QSA]
</IfModule>

Now the page loads fine in my browser, but Google and Facebook cannot load it. When I test using
developers.facebook.com/tools/debug/og/object

Facebook returns the error "the server responded 404 error".

Any ideas?

<?php

$api_uri = "http://web.contentsource.com/blahblah";

$parameters = array(
	"apikey"				=> "5551212",
	"format"				=> "atom",
	"links"					=> "resource-path",
	"styles"				=> "enhanced",
	"content_only"			=> "false",
	"prettyprint"			=> "false",
	"request_handler_uri"	=> "http://www.mydomain.com/health/"
);

if (array_key_exists("resource", $_GET))
{
	$resource = urlencode($_GET['resource']);
}

if (!isset($resource) || $resource == "/")
{
	//default content uri goes here	
	$resource = "/assets/~default";
}

$request_uri = $api_uri . $resource . "?" . http_build_query($parameters);

$curl = curl_init($request_uri);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 5);
$response = curl_exec($curl);
    
header("Content-type: text/html; charset=utf-8");


/*$pattern='#<entry(.*)\<content type=\"xhtml\">#si';
preg_match($pattern,$response,$piece);*/



$pattern='#\<id>(.*)\<\/id>#si';
preg_match($pattern,$response,$piece);

$pattern='#\<summary type=\"text\">(.*)\<\/summary>#si';
preg_match($pattern,$response,$description);

$pattern='#\<author>(.*)\<\/author>#si';
preg_match($pattern,$response,$author);

$pattern='#\<updated>(.*)\<\/updated>#si';
preg_match($pattern,$response,$updated);


$response=str_replace($piece[0],"",$response);
$response=str_replace($description[0],"",$response);
$response=str_replace($author[0],"",$response);
$response=str_replace($updated[0],"",$response);


$pattern='#\<title>(.*)\<\/title>#si';
preg_match($pattern,$response,$title);
$title=$title[0];

$pattern='#\<summary type=\"text\">(.*)\<\/summary>#si';
preg_match($pattern,$response,$description);
$description='<meta name="description" content=\''.$description[1].'\'>';

$pattern='#<link(.*)/>#U';
preg_match_all($pattern,$piece[0],$links);
$links=$links[0];

$new_content = strip_tags($response);
// eregi_replace() was removed in PHP 7; preg_replace() with the i and s flags behaves the same here
$new_content = preg_replace('#<head[^>]*>.*</head>#is', ' ', $new_content);
$new_content = preg_replace('#<script[^>]*>.*</script>#is', ' ', $new_content);
$new_content = preg_replace('#<style[^>]*>.*</style>#is', ' ', $new_content);
$new_content = preg_replace('#<[^>]*>#', ' ', $new_content);
$new_content = str_ireplace('&nbsp;', '', $new_content);

$resource = urldecode($resource);
$theurl = "http://www.mydomain.com/health" . $resource;
?>
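
A browser will render whatever body the server sends even when the status line says 404, so "the page loads" and "the server returned 200" are not the same thing. A minimal sketch for checking the status code a crawler would receive is below. The function name fetch_status, the file name check_status.php and the user-agent default are assumptions for illustration, not part of the original script.

```php
<?php
// Sketch: fetch a URL and return only the HTTP status code.
// The default user agent imitates Facebook's crawler (an assumption;
// change it to compare against a normal browser string).

function fetch_status($url, $userAgent = 'facebookexternalhit/1.1')
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // keep the body out of the output
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    curl_exec($curl);
    // CURLINFO_HTTP_CODE is 0 when no HTTP response was received at all
    return (int) curl_getinfo($curl, CURLINFO_HTTP_CODE);
}

// Command-line usage (hypothetical file name):
//   php check_status.php http://www.mydomain.com/health/assets/heart
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    echo fetch_status($argv[1]), "\n";
}
```

If this prints 404 for a URL that looks fine in the browser, the problem is in the response status rather than in the rewrite rule's target.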

Question by:cptnem0
2 Comments

Accepted Solution

by:
babuno5 earned 500 total points
ID: 36898258
Your code seems to be fine.

The next thing to check: make the request from Facebook, then look in your Apache access log to see which URL is returning the 404.
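
As a rough sketch of that log check, the following lists the paths that returned 404 to a given user agent. It assumes the access log uses Apache's "combined" format and that Facebook's crawler identifies itself with "facebookexternalhit"; the function name find_404s and the log path are hypothetical and will differ per server.

```php
<?php
// Sketch: collect request paths that got a 404 for a given user agent.
// Assumes the "combined" log format:
//   host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"

function find_404s(array $lines, $agentNeedle)
{
    $pattern = '/^\S+ \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"/';
    $hits = array();
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m)
            && $m[3] === '404'
            && stripos($m[4], $agentNeedle) !== false) {
            $hits[] = $m[2]; // the requested path
        }
    }
    return $hits;
}

$logFile = '/var/log/apache2/access.log'; // hypothetical path
if (is_readable($logFile)) {
    $lines = file($logFile, FILE_IGNORE_NEW_LINES);
    foreach (find_404s($lines, 'facebookexternalhit') as $path) {
        echo $path, "\n";
    }
}
```

Comparing the logged paths with the URLs you expect shows whether the crawler is requesting something different from what the browser requests.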



 

Author Closing Comment

by:cptnem0
ID: 37065093
Checking the Apache logs did lead to finding the problem: there was a stray { somewhere causing an error.
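
For anyone chasing a similar stray brace, a small sketch of a syntax check is below. It assumes PHP 7.1 or later, where token_get_all() with the TOKEN_PARSE flag throws a ParseError on invalid source; the function name syntax_error_of is made up for this example. Running php -l on the file from the command line is the more usual way to do the same thing.

```php
<?php
// Sketch: return a description of the first syntax error in a PHP source
// string, or null if it parses cleanly. Requires PHP 7.1+.

function syntax_error_of(string $source): ?string
{
    try {
        token_get_all($source, TOKEN_PARSE); // parses without executing
        return null;
    } catch (ParseError $e) {
        return $e->getMessage() . ' on line ' . $e->getLine();
    }
}

// Usage from the command line (hypothetical file name):
//   php check_syntax.php index.php
if (PHP_SAPI === 'cli' && isset($argv[1]) && is_readable($argv[1])) {
    $error = syntax_error_of(file_get_contents($argv[1]));
    echo $error === null ? "No syntax errors found\n" : $error . "\n";
}
```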