robots.txt

Hi

I want to block access to my entire website for all robots except Google, Yahoo and MSN, AND I want to block one directory, /i, for ALL robots INCLUDING Google, Yahoo and MSN. I have created the following robots.txt. Can someone please confirm whether this is correct? I also don't want the robots to kill my server, so they should crawl as slowly as possible.

User-agent: Googlebot
Disallow:
User-agent: MSNBot
Disallow:
User-agent: Slurp
Disallow:
User-agent: *
Disallow: /
Crawl-delay: 120
Disallow: /i/


I am using Apache on some machines and Nginx on others, all running CentOS.
sysautomation asked:
Dave Baldwin (Fixer of Problems) commented:
Looks good to me. The web server doesn't matter; 'robots.txt' is just a text file. Also, 'Crawl-delay' is somewhat non-standard, so I don't know which bots will recognize it.

You must realize that 'robots.txt' does not actually block anything. 'Good' bots will obey your requests; 'bad' bots won't even read the file, unless it's to see what you don't want them to look at.

http://www.robotstxt.org/robotstxt.html
http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive
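
For what it's worth, a bot reads only the single User-agent group that best matches it and ignores the 'User-agent: *' group, so if I read your file right, Google, MSN and Yahoo are not actually being kept out of /i/. If you want that, the Disallow has to be repeated in each named group. A sketch (untested, and assuming MSNBot and Slurp honor Crawl-delay while Googlebot does not):

# Google: everything except /i/ (Googlebot ignores Crawl-delay)
User-agent: Googlebot
Disallow: /i/

# MSN: everything except /i/, and crawl slowly
User-agent: MSNBot
Disallow: /i/
Crawl-delay: 120

# Yahoo (Slurp): everything except /i/, and crawl slowly
User-agent: Slurp
Disallow: /i/
Crawl-delay: 120

# Everyone else: stay out entirely
User-agent: *
Disallow: /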
 
sysautomation (Author) commented:
Thanks. Any advice on blocking 'bad' bots that don't respect robots.txt? I see many bots coming from very different blocks of IPs, so it doesn't seem possible to block them at the firewall.
 
Dave Baldwin (Fixer of Problems) commented:
No, it's almost impossible to block them. The only reason I block search bots from one directory is to keep them from trying to index every day and time on my calendar page. Other than that, I don't bother. The 'good' robots aren't going to access your site very often. Google typically comes by once every 3 months unless you become very popular and change content very often. Bing is similar, and Yahoo gets its search results from Bing now, so I don't know whether they even run their own search bots anymore.
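
If you want to turn away the ones that at least send an honest User-Agent string, you can refuse them at the web server rather than the firewall. A rough sketch for Nginx (the bot names, server name and path here are only examples, and anything that fakes its User-Agent will still get through; Apache can do much the same with SetEnvIfNoCase plus a Deny rule):

# nginx: return 403 to requests whose User-Agent matches known bad bots
server {
    listen 80;
    server_name example.com;    # example name only
    root /var/www/html;         # example path only

    # case-insensitive regex match against the User-Agent header
    if ($http_user_agent ~* (AhrefsBot|MJ12bot|SemrushBot)) {
        return 403;
    }
}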
 
Dave Baldwin (Fixer of Problems) commented:
For what it's worth, Baidu, the Chinese search engine, hits my site more than anyone else.

Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
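
With your robots.txt above, Baiduspider already falls under the 'User-agent: *' group and gets told to stay out of the whole site, assuming it obeys. If you ever open up the * group, you can also name it directly:

# explicit group for Baidu
User-agent: Baiduspider
Disallow: /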