mmguide

Google Webmaster Tools: strange URLs

Hi,
When I check Webmaster Tools, I see some very odd pages listed as duplicates. These pages are also accessible, even though they are not relevant, should not exist, and do not display properly.

e.g. mysite.com/page.php/folder/bfolder/cfolder/sitemap.php?offset=240
or
page.php?offset?offset=480

These resolve to a page but are not linked to. I wonder how I can stop these pages existing.
Dave Baldwin

It sounds like you are using a CMS that looks up the pages in a database and rewrites the URLs.  If you have a CMS like Joomla or Drupal, you should click on "Request Attention" and get those zones added to your question so those experts will look at your question.  This doesn't sound like an Apache problem.
mmguide

ASKER

Hi,
I am using a self-built PHP page system that uses a database.

i.e. the URL locationA-page.php is rewritten from page.php?location=locationA in an .htaccess file.
Have you uploaded a sitemap? Google will try to access anything that looks like a URL when it scans your site.  Is there any possibility that your htaccess code could generate two different URLs that map to the same page?
mmguide

ASKER

Possibly.
Is there a way to ensure it maps to a specific URL in .htaccess?
(I have changed all the site links to absolute URLs.)

Should I do this in the .htaccess file, and if so, do you have any examples?
You can use 'mod_rewrite' to rewrite the URLs but I've never done that partly because I don't really understand the rules.  You would have to click on "Request Attention" above and get some others to look at your question for that.
If Google has indexed it, then there's a 99% chance there is a link somewhere using those dodgy URLs.

You need to find them and fix them.

Once fixed you have several things you can do to help things along:

set up .htaccess to 301 redirect those dodgy links back to the canonical page
add canonical tags
block the dodgy links in robots.txt
tell Google via WMT to ignore those parameters

You could 301 based on a pattern in PHP:


function right_url($url) {
    if ($_SERVER['REQUEST_URI'] != $correct_url) {
        header('HTTP/1.1 301 Moved Permanently');
        header('Location: ' . $correct_url);
        exit;
    }
}

right_url('/my_correct_url');
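
For the "add canonical tags" suggestion above, here is a minimal sketch of how the tag could be generated in PHP. The helper name canonical_link_tag, the https:// scheme, and the idea that an offset of 0 means the first page are all assumptions for illustration, not something from the original site.

```php
<?php
// Hypothetical helper: builds the <link rel="canonical"> tag for the page <head>.
// $baseUrl is the clean URL of the page; $offset is the paging offset (0 = first page).
function canonical_link_tag(string $baseUrl, int $offset = 0): string
{
    // Only paginated pages carry the offset parameter in their canonical URL.
    $url = $offset > 0 ? $baseUrl . '?offset=' . $offset : $baseUrl;

    // Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">';
}

// Example: print the tag for the page at offset 240.
echo canonical_link_tag('https://mysite.com/location.php', 240), "\n";
```

With this in the head of the page, a crawler that reaches the page through one of the odd URLs is still told which URL is the preferred one.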
mmguide

ASKER

Hi,
virmaior:
Shouldn't that be:
function right_url($correct_url) {
    // Redirect permanently when the requested URI is not the expected one.
    if ($_SERVER['REQUEST_URI'] != $correct_url) {
        header('HTTP/1.1 301 Moved Permanently');
        header('Location: ' . $correct_url);
        exit;
    }
}

right_url('/my_correct_url');

Tiggerito:
Finding the dodgy URL is very tricky. Webmaster Tools doesn't make it easy. Any ideas?

mmguide

ASKER

Isn't there a way to redirect anything after location.php (apart from an offset parameter) to location.php?

i.e. incorrect URL: mysite.com/location.php/folder/bfolder/
correct URL: mysite.com/location.php


mysite.com/location.php?offset=10 (is also potentially a correct URL, depending on the number of records)
ASKER CERTIFIED SOLUTION
Tony McCreath

mmguide -> sorry about the error.

You can do what you want with directory rewriting using mod_rewrite.

RewriteEngine On
RewriteRule ^location\.php/.+ /location.php [R=301,L]

But this won't know the offset, etc. Apache does not know those things, so you would need to handle the redirects in PHP.
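
Handling the redirect in PHP could look something like the sketch below. It is only an illustration under some assumptions: the function name canonical_location_url is made up, the script is assumed to live at /location.php, and an offset is treated as valid only if it is a whole number greater than zero. Checking the offset against the real number of records would need a database lookup that is not shown here.

```php
<?php
// Hypothetical helper: returns the canonical form of a request URI for
// location.php. Requests for other scripts are returned unchanged.
function canonical_location_url(string $requestUri, string $script = '/location.php'): string
{
    $path  = (string) parse_url($requestUri, PHP_URL_PATH);
    $query = (string) parse_url($requestUri, PHP_URL_QUERY);

    // Only handle the script itself, or the script followed by extra path segments.
    if ($path !== $script && strpos($path, $script . '/') !== 0) {
        return $requestUri;
    }

    parse_str($query, $params);
    $offset = $params['offset'] ?? null;

    // Assumption: keep the offset only when it is all digits and greater than zero.
    if (is_string($offset) && ctype_digit($offset) && (int) $offset > 0) {
        return $script . '?offset=' . (int) $offset;
    }

    // Everything else (extra folders, malformed parameters) collapses to the bare script.
    return $script;
}

// Usage at the top of location.php, before any output is sent.
if (isset($_SERVER['REQUEST_URI'])) {
    $canonical = canonical_location_url($_SERVER['REQUEST_URI']);
    if ($_SERVER['REQUEST_URI'] !== $canonical) {
        header('Location: ' . $canonical, true, 301);
        exit;
    }
}
```

Keeping the URL logic in its own function means it can be tried against the odd URLs from Webmaster Tools without sending any headers.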
mmguide

ASKER

Xenu helped to sort out a few problems, and the script I used seemed to solve the others.