• Status: Solved

Site stopped appearing in Google: could a flawed robots.txt cause this?

A different client (different from another one I mentioned here) suddenly had his site stop being spidered/cached in Google (and apparently in Yahoo).

Here's a snippet from his page HTML source:  <META NAME="robots" CONTENT="index,follow">

He ran a validator test and got this result regarding the robots.txt file:

Syntax check robots.txt on robots.txt (30 bytes)
Line Severity Code
4 ERROR There should be atleast 1 disallow line in any Robots.txt.
User-agent: *
 We're sorry, this robots.txt does NOT validate.
 Warnings Detected: 1
 Errors Detected: 1
4 warning An empty user agent field was detected. Each User-Agent record should have atleast one disallow line per record. This error may have also been generated due to bad line enders.
User-agent: *


  robots.txt source code for robots.txt
Line Code
 1  # Robots.txt
 2  #
 3  
 4

---------------------------- end of text result ---
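For anyone who wants to reproduce the complaint without the online validator, here is a minimal Python sketch of the rule it seems to be applying: every User-agent record should contain at least one Disallow line. The function name `records_missing_disallow` and the `FLAWED` / `CORRECTED` strings are made up for illustration, and the check assumes records are separated by blank lines, as the original robots.txt convention describes. It is not the validator's actual implementation.

```python
FLAWED = "# Robots.txt\n#\n\nUser-agent: *\n"
CORRECTED = "# Robots.txt\n#\n\nUser-agent: *\nDisallow:\n"


def records_missing_disallow(robots_txt):
    """Return the user-agent names of records that have no Disallow line."""
    missing = []
    agents = []           # user-agents named in the record being read
    has_disallow = False  # whether that record has seen a Disallow line

    def flush():
        # Close the current record and remember it if it had no Disallow.
        nonlocal agents, has_disallow
        if agents and not has_disallow:
            missing.extend(agents)
        agents, has_disallow = [], False

    for raw in robots_txt.splitlines():
        if raw.strip() == "":
            flush()  # a blank line ends the current record
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # comment-only line, not a record boundary
        field, _, value = line.partition(":")
        field = field.strip().lower()
        if field == "user-agent":
            agents.append(value.strip())
        elif field == "disallow":
            has_disallow = True
    flush()  # close the last record at end of file
    return missing


print(records_missing_disallow(FLAWED))     # ['*']
print(records_missing_disallow(CORRECTED))  # []
```

The first call flags the `*` record because the file stops right after the User-agent line, which matches what the validator reported for line 4.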

Is the above file so flawed that it could stop spiders from crawling the site? I think so, but I wanted feedback from the experts here.


Any comments/suggestions/solutions appreciated!

Thanks

Rowby
 
duz Commented:
Rowby -

Certainly the robots.txt is invalid and should be deleted immediately.  

There is no way of telling what the bots will do with invalid syntax in robots.txt; we only know how they behave when it's valid.

- duz
 
humeniuk Commented:
The proper syntax to allow all robots to visit your entire site:

# Robots.txt
#

User-agent: *
Disallow:

 
However, if that is your intent (to allow all), you should just do without the file altogether. There is anecdotal evidence that invalid robots.txt syntax is a problem for search engine spiders, and since we know invalid HTML code can inhibit spiders, that makes some sense. But, as duz noted, there is no way of knowing for sure. When it comes to search engines, it's better to play it safe.
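If you do keep a robots.txt, one cheap sanity check is to run it through a parser before uploading it. Below is a minimal sketch using Python's standard `urllib.robotparser`. The `allows` helper, the `ExampleBot` agent name and the example.com URL are placeholders chosen for illustration. Keep in mind this only shows how one particular parser reads the file; it says nothing about what Google's or Yahoo's spiders actually do with it.

```python
from urllib.robotparser import RobotFileParser

# The client's file as the validator showed it (User-agent, no Disallow).
FLAWED = "# Robots.txt\n#\n\nUser-agent: *\n"

# The allow-all version given above.
CORRECTED = "# Robots.txt\n#\n\nUser-agent: *\nDisallow:\n"

# Placeholder URL, not the client's site.
URL = "http://www.example.com/index.html"


def allows(robots_txt, url, agent="ExampleBot"):
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print("corrected:", allows(CORRECTED, URL))  # corrected: True
print("flawed:   ", allows(FLAWED, URL))     # result depends on the parser
```

The empty `Disallow:` line is what tells a parser "nothing is off limits", so the corrected file comes back as fetchable. Whatever this parser reports for the flawed file, a real spider may handle the missing Disallow line differently, which is the reason to fix or remove the file rather than guess.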

BTW - your page doesn't do too well in terms of validation: http://validator.w3.org/check?uri=http%3A%2F%2Fwww.the-merchant-account-advisor.com.
 
Rowby Goren (Author) Commented:
Thanks for your solutions/suggestions, humeniuk & duz.

I have forwarded them to the client. He has deleted the robots.txt file for now and will either leave it out or replace it with a correctly written one.

I told him about the invalid HTML and am encouraging him to fix it.

Rowby
 
humeniukCommented:
Glad to be able to help, rowby.  Thanks for the A.