Bob Schneider asked:

Optimal Robots.txt

Is this an optimal robots.txt file?

# This file can be used to affect how search engines and other web site crawlers see your site.
# For more information, please see http://www.w3.org/TR/html4/appendix/notes.html#h-B.4.1.1
# WebMatrix 2.0

# ----------
# -- bingbot, microsoft indexer
User-agent: Bingbot
Crawl-delay: 10
# ----------
# -- msnbot, microsoft indexer
User-agent: msnbot
Crawl-delay: 10
# ----------
# -- Googlebot
User-agent: googlebot
# ----------
# -- Slurp, Yahoo indexer
Crawl-delay: 10
# ----------
# -- Default for all others
User-agent: *
Disallow: /
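
One way to see how a file like this is grouped is to feed it to a parser. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the `summarize` helper and the list of agent names are illustrative assumptions, not part of the original question, and real crawlers may interpret the file differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The file from the question, from the first group onward.
ROBOTS_TXT = """\
# ----------
# -- bingbot, microsoft indexer
User-agent: Bingbot
Crawl-delay: 10
# ----------
# -- msnbot, microsoft indexer
User-agent: msnbot
Crawl-delay: 10
# ----------
# -- Googlebot
User-agent: googlebot
# ----------
# -- Slurp, Yahoo indexer
Crawl-delay: 10
# ----------
# -- Default for all others
User-agent: *
Disallow: /
"""


def summarize(robots_txt, agents, url="/"):
    """Return {agent: (may_fetch, crawl_delay)} as Python's parser sees it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: (parser.can_fetch(agent, url), parser.crawl_delay(agent))
        for agent in agents
    }


if __name__ == "__main__":
    agents = ["Bingbot", "msnbot", "Googlebot", "Slurp", "SomeOtherBot"]
    for agent, (allowed, delay) in summarize(ROBOTS_TXT, agents).items():
        print(f"{agent}: allowed={allowed}, crawl_delay={delay}")
```

Under this parser, comment lines do not start a new group, so the `Crawl-delay: 10` under the Slurp comment attaches to the `googlebot` group. Because no `User-agent: Slurp` line exists, Slurp falls through to the `*` group and is disallowed along with every other unlisted crawler.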
ASKER CERTIFIED SOLUTION
Dr. Klahn

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
On my customer's site, Facebook does not identify its crawler except by IP address. Our tracking shows some Amazon hosts, but they look like client sites and not Amazon itself.

It also depends on your purpose for putting 'robots.txt' on your site. At this point, I believe that Google and some others, such as Baidu, crawl all pages and use 'robots.txt' only to decide which ones to show to the public. They are trying to catalog everything on the internet. That applies especially to Baidu, which probably feeds results containing certain keywords to the Chinese government.