Blocking a spider

www.picsearch.com (spider4.picsearch.com, I presume, from the IP address 62.119.133.14 and various ranges) is constantly indexing my site. It seems to visit every day and goes through hundreds of pages, and I'm even tracking multiple visits from it at the same time. One day there were 6 simultaneous sessions, with over 4000 pages visited.
How can I block just this one spider?
Asked by: Gary

LucF (EMEA Server Engineer) commented:
Hi GaryC123,

Use robots.txt to disallow the spider from indexing your site, as explained here:
http://www.picsearch.com/menu.cgi?item=FAQ#q5

The spider obeys the rules in that file.

Greetings,

LucF
Gary (Author) commented:
Maybe I should've explored the site a bit more...
Never used robots.txt.  Do I literally just put a txt file on the server with
User-agent: psbot
Disallow: /

in it?
LucF (EMEA Server Engineer) commented:
Yes, robots.txt is pretty straightforward.
Just put the robots.txt file in the root of your website, like Experts Exchange does:
http://www.experts-exchange.com/robots.txt

Greetings,

LucF
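One way to sanity-check the two-line file before uploading it is to run it through Python's standard `urllib.robotparser`. This is only a sketch: the helper name `is_allowed` and the host `example.com` are placeholders that do not come from the thread, and a real crawler may interpret the file slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The two lines from the question; on a live site they would be served at /robots.txt.
ROBOTS_TXT = """\
User-agent: psbot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not the asker's site.
    for agent in ("psbot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://example.com/index.html"))
```

With this file only psbot is refused; agents that are not named fall through to the default, which is to allow.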
LucF (EMEA Server Engineer) commented:
Btw, if you want to know more about the robots.txt standard, take a look at http://www.robotstxt.org/

LucF
Gary (Author) commented:
Thanks LucF
LucF (EMEA Server Engineer) commented:
You're very welcome Gary,

LucF
duz commented:
GaryC123 -

You may want to make a proper job of it :)

User-agent: Wget
User-agent: vsecrawler
User-agent: TutorGig
User-agent: Teleport Pro
User-agent: Steeler
User-agent: semanticdiscovery
User-agent: ScoutAbout
User-agent: RPT-HTTPClient
User-agent: Reaper
User-agent: rabaz
User-agent: QuepasaCreep
User-agent: puf
User-agent: psbot
User-agent: PhpDig
User-agent: OWR_Crawler
User-agent: obot
User-agent: NPBot
User-agent: NexaBot
User-agent: NaverRobot
User-agent: MSIECrawler
User-agent: Larbin
User-agent: Jyxobot
User-agent: InfoNaviRobot
User-agent: http://www.almaden.ibm.com/cs/crawler
User-agent: grub-client
User-agent: Generic
User-agent: Gaisbot
User-agent: EgotoBot
User-agent: Dumbot
User-agent: dloader(NaverRobot)
User-agent: BravoBrian
User-agent: baiduspider
User-agent: asterias
User-agent: ASPSeek
Disallow: /

- duz
Gary (Author) commented:
Ermm who are all them?  I don't want to block everyone as I do want the site indexed, just this one particular robot was eating up bandwidth for nothing and I'm not really interested in having all the images on my site being indexed.
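On the worry about blocking everyone: in a robots.txt record, consecutive `User-agent` lines share the `Disallow` rules that follow them, so a list like the one above only affects the agents it names. A small sketch with Python's standard `urllib.robotparser` illustrates this, using a shortened excerpt of the list; the helper name `allowed_agents` and the host `example.com` are placeholders, not something from the thread.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the list above: several User-agent lines, one shared rule.
GROUPED_ROBOTS_TXT = """\
User-agent: Wget
User-agent: Teleport Pro
User-agent: psbot
User-agent: baiduspider
Disallow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Map each agent name to True/False: may it fetch url under robots_txt?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # example.com is a placeholder host.
    result = allowed_agents(
        GROUPED_ROBOTS_TXT,
        ["psbot", "Wget", "Googlebot"],
        "http://example.com/index.html",
    )
    for agent, ok in result.items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

Crawlers that are not named in the record, such as the major search engines, remain free to index the site.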
LucF (EMEA Server Engineer) commented:
duz has a point there: you don't want these spiders crawling your site :o)
Those are undesired, but at least they obey the robots.txt standard. There are also others that don't.
duz commented:
GaryC123 -

>Ermm who are all them?

Useless bandwidth-eating bots (that obey robots.txt)

- duz

LucF -

>there are also others that don't obey the standard

Well over 150 that I see regularly. If you are interested in stopping them, create a 'spider trap', like this one for example: http://www.kloth.net/internet/bottrap.php

- duz
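The general idea of a spider trap is to disallow a path in robots.txt, link to it where no human visitor would click, and treat any client that requests it anyway as a rule-ignoring bot. The linked page describes its own implementation; what follows is a separate, minimal sketch that only scans access-log lines for hits on the trap. The path `/bot-trap/`, the function name `trapped_ips`, and the sample log lines (Common Log Format, documentation-range IPs) are all assumptions made for illustration.

```python
import re

# Hypothetical trap path: it would be listed under "Disallow:" in robots.txt
# and linked invisibly from a page, so well-behaved visitors never request it.
TRAP_PATH = "/bot-trap/"

# Start of a Common Log Format line: IP, identity, user, [timestamp], "METHOD path
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')

# Made-up sample lines, not taken from any real server.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2005:13:55:36 +0000] "GET /bot-trap/ HTTP/1.0" 200 512',
    '198.51.100.4 - - [10/Oct/2005:13:56:01 +0000] "GET /index.html HTTP/1.0" 200 2326',
]


def trapped_ips(log_lines, trap_path=TRAP_PATH):
    """Return the set of client IPs that requested a URL under trap_path."""
    ips = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_path):
            ips.add(match.group(1))
    return ips


if __name__ == "__main__":
    for ip in sorted(trapped_ips(SAMPLE_LOG)):
        print("candidate for blocking:", ip)
```

The resulting addresses could then be fed into whatever blocking mechanism the server offers; that step is left out here because it depends on the server setup.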
LucF (EMEA Server Engineer) commented:
Thanks for that link duz, I'm trying it now.

LucF