Site stopped appearing in Google. Could a flawed robots.txt cause this?

Posted on 2005-04-10
Last Modified: 2010-04-27
A different client (different from another one I mentioned here) suddenly had his site stop being spidered/cached in Google (and apparently in Yahoo).

Here's a snippet from his page's HTML source:  <META NAME="robots" CONTENT="index,follow">

He ran a validator test and got this result for the robots.txt file:

Syntax check robots.txt on robots.txt (30 bytes)
Line Severity Code
4 ERROR There should be atleast 1 disallow line in any Robots.txt.
User-agent: *
 We're sorry, this robots.txt does NOT validate.
 Warnings Detected: 1
 Errors Detected: 1
4 warning An empty user agent field was detected. Each User-Agent record should have atleast one disallow line per record. This error may have also been generated due to bad line enders.
User-agent: *

  robots.txt source code for robots.txt
Line Code
 1  # Robots.txt
 2  #

---------------------------- end of text result ---
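The rule behind both messages, that each User-agent record needs at least one Disallow line, can be sketched in Python roughly as follows. This is only an illustration, not the validator's actual code. The function name `records_missing_disallow` is made up here, and `CLIENT_FILE` is an assumed reconstruction of the 30-byte file: the listing above shows only lines 1 and 2, and the validator reports `User-agent: *` on line 4.

```python
def records_missing_disallow(text):
    """Return the User-agent values of records that have no Disallow line."""
    missing = []
    agents = []           # User-agent values of the record being read
    has_disallow = False  # whether that record contains a Disallow line

    # The extra "" makes sure the last record is closed at end of file.
    for raw in text.splitlines() + [""]:
        if not raw.strip():
            # A blank line ends the current record.
            if agents and not has_disallow:
                missing.extend(agents)
            agents, has_disallow = [], False
            continue
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only line, not a record boundary
        field, _, value = line.partition(":")
        field = field.strip().lower()
        if field == "user-agent":
            agents.append(value.strip())
        elif field == "disallow" and agents:
            has_disallow = True
    return missing


# ASSUMPTION: reconstructed from the validator output, not the real file.
CLIENT_FILE = "# Robots.txt\n#\n\nUser-agent: *\n"

if __name__ == "__main__":
    print(records_missing_disallow(CLIENT_FILE))
```

For the assumed file this reports the `*` record, which matches the validator's complaint about line 4.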

Is the above file so flawed that it could stop spiders altogether? I think so, but I wanted feedback from the experts here.

Any comments/suggestions/solutions appreciated!


Question by:Rowby Goren
    LVL 24

    Accepted Solution

    Rowby -

    Certainly the robots.txt is invalid and should be deleted immediately.  

    There is no way of telling what the bots will do with invalid syntax in robots.txt; we only know how they behave when it's valid.

    - duz
    LVL 33

    Assisted Solution

    The proper syntax to allow all robots to visit your entire site:

    # Robots.txt

    User-agent: *
    Disallow:

    However, if that is your intent (to allow all), you can simply do without the file altogether.  There is anecdotal evidence that invalid robots.txt syntax is a problem for search engine spiders, and since we know invalid HTML can inhibit spiders, that makes some sense, but - as duz noted - there is no way of knowing for sure.  When it comes to search engines, it's better to play it safe.
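As a rough local check, Python's standard-library `urllib.robotparser` can show how one particular parser reads a robots.txt file. This is only a sketch: it says nothing about how Google's or Yahoo's spiders treat the same file, and the helper name `allows`, the `Googlebot` agent string and the example.com URL are placeholders chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt, url, agent="Googlebot"):
    """Return True if this parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(agent, url)


# An empty Disallow value means "nothing is disallowed".
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# For contrast: a single slash blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Placeholder URL, not the client's site.
URL = "http://www.example.com/page.html"

if __name__ == "__main__":
    for name, text in (("allow all", ALLOW_ALL), ("block all", BLOCK_ALL)):
        print(name, "->", allows(text, URL))
```

A check like this is useful for catching an accidental `Disallow: /`, but it cannot settle how a real spider reacts to a file that does not validate.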

    BTW - your page doesn't do too well in terms of validation:
    LVL 9

    Author Comment

    by:Rowby Goren
    Thanks for your solutions/suggestions, humeniuk & duz.

    I have forwarded them to the client. He has deleted the robots.txt file for now and will either replace it or use it correctly.

    I told him about the invalid HTML and am encouraging him to fix it.

    LVL 33

    Expert Comment

    Glad to be able to help, rowby.  Thanks for the A.
