naha asked:

Googlebot asking for strange pages & NOT index.htm

Hi,
  I created a website (www.thefirstaidbox.co.uk) a few months ago and it still has not been crawled properly by Googlebot. I looked at my logs and Googlebot has visited twice. Each time it requested (GET) only the following:

/robots.txt               - Expected this (& I guess I should create one)
/search.html           - No such page  (So I'll probably create one!)
/cgi-bin/ocb/ocb.cgi
/quikstore.html
/quikcode.html+

The last three appear to be related to some e-commerce software, which this site doesn't use (we use a different package).

I'm very confused as to why Googlebot asked for these pages and why it didn't try to crawl from index.htm(l). The site homepage WAS submitted to Google.

I'm aware that the site is short of both links and content, which is being worked on, but that doesn't explain the odd Googlebot behaviour.

Any ideas?
ASKER CERTIFIED SOLUTION
duz

(This solution is only available to members.)

naha (ASKER)

The site is .co.uk not .com

Hmmm - looks like Googlebot is looking at the .co.uk site and expecting the same structure!
Very, very strange. They are on separate and unrelated servers.

Ahh - I see what's happened. The other site (.com) was set up by a guy who apparently took 10 months or so to produce it (minute as it is, it must have taken all of 5 minutes), and he has some duff links to the .co.uk site. Guess I'll take advantage of them, then, and create real pages!

naha -

>expecting the same structure

Not so much the same structure, but Googlebot IS expecting to find the named page at the end of each link it follows. This is a good example of why it is better to help Googlebot find your new site through links rather than use the 'Submit Your Site' facility. It allows you to manage the discovery process - a factor that is often overlooked by SEOs.

- duz
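
For anyone wanting to repeat the log check described in the question, here is a minimal Python sketch that lists which paths Googlebot requested and what status the server returned. It assumes the server writes the Apache/NCSA "combined" log format, which the thread does not confirm. The names `LOG_LINE`, `googlebot_requests`, `summarise` and the `SAMPLE` lines are all invented for illustration and are not taken from the asker's real log. It also matches on the user-agent string only, which can be spoofed, so treat the output as a first look rather than proof.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_requests(lines):
    """Yield (path, status) for each request whose user agent mentions Googlebot."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            yield match.group("path"), int(match.group("status"))


def summarise(lines):
    """Count Googlebot requests per (path, status) pair."""
    return Counter(googlebot_requests(lines))


# Hypothetical sample lines, made up for this sketch.
SAMPLE = [
    '66.249.66.1 - - [01/Jan/2005:10:00:00 +0000] "GET /robots.txt HTTP/1.1" '
    '404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '66.249.66.1 - - [01/Jan/2005:10:00:05 +0000] "GET /search.html HTTP/1.1" '
    '404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '66.249.66.1 - - [01/Jan/2005:10:00:09 +0000] "GET /quikstore.html HTTP/1.1" '
    '404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [01/Jan/2005:10:01:00 +0000] "GET /index.htm HTTP/1.1" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    # Print each (path, status) pair Googlebot asked for, most frequent first.
    for (path, status), count in summarise(SAMPLE).most_common():
        print(f"{count:3d}  {status}  {path}")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarise(open("access.log"))`, where `access.log` is a placeholder for wherever your server keeps its log. A run of 404s for pages you never created suggests Googlebot arrived via links from another site, as turned out to be the case here.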