Solved

How to block spiders from my site except google & yahoo

Posted on 2010-09-05
5
388 Views
Last Modified: 2013-12-09
Hi,

I want to block all crawlers/spiders who enter into my site and alter my stats and db resources.

I would like only to allow google.

How to do this?

Thank you
0
Comment
Question by:Fernanditos
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
5 Comments
 
LVL 96

Expert Comment

by:Lee W, MVP
ID: 33605979
Create a robots.txt file - if the search engine obeys it's settings, this should suffice (what, no Bing?)http://www.searchtools.com/robots/robots-txt.html
0
 
LVL 1

Expert Comment

by:martinnolan
ID: 33606033
I would suggest using Google Analytics for your website statistics. It’s free, very easy to add to your site and in answer to part of your question, because it uses javascript as the tracking mechanism spiders don’t trigger this and therefore will not show or skew your website statistics.

Is database resource a major issue...?

Robots are good but make sure you get this 100% correct as scripted incorrectly you could block the major search engines - maybe check http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449

0
 

Author Comment

by:Fernanditos
ID: 33606079
Is this robots.txt syntax correct to allow access only to google, yahoo, msn, adsese and bing?

Do you consider there are others very important?

In practice I only get traffic from google.

Please review the code and tell me if syntax is correct, im afraid not.
User-agent: googlebot
User-agent: Slurp
User-agent: msnbot
User-agent: bingbot
User-agent: AdsBot-google

Disallow:
User-agent: *
Disallow: /

Open in new window

0
 

Author Comment

by:Fernanditos
ID: 33606113
martinnolan, thank you but the question was onother thing. I just want to know how to ALLOW only the mentioned agents amd exclude all others
0
 
LVL 96

Accepted Solution

by:
Lee W, MVP earned 500 total points
ID: 33607567
I've never configured anything special in a robots.txt  - I would suggest you try this validator and see what it says:
http://tool.motoricerca.info/robots-checker.phtml
0

Featured Post

Is Your Team Achieving Their Full Potential?

74% of employees feel they are not achieving their full potential. With Linux Academy, not only will you strengthen your team's core competencies but also their knowledge of of the newest IT topics.

With new material every week, we'll make sure that you stay ahead of the game.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

SEO can be a real minefield to navigate, but there are three simple ways to up your SEO game just be re-assessing your content output.
Australian government abolished Visa 457 earlier this April and this article describes how this decision might affect Australian IT scene and IT experts.
Viewers will get an overview of the benefits and risks of using Bitcoin to accept payments. What Bitcoin is: Legality: Risks: Benefits: Which businesses are best suited?: Other things you should know: How to get started:
The is a quite short video tutorial. In this video, I'm going to show you how to create self-host WordPress blog with free hosting service.

695 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question