gateguard
 asked on

Web crawlers shutting down my site

I have a Tomcat site running on a Windows Server 2008 R2 machine.

My site is no longer responsive.

Running netstat -a yields the results in the attached file.

How do I block all those connections from jamming my port 80?

Thanks
netstata2.txt
Apache Web Server

Last Comment
Guy Lidbetter

8/22/2022 - Mon
SOLUTION
Guy Lidbetter

gateguard

ASKER
Thanks, Guy.  I'm going to try these solutions.

I do have a question about the Wikipedia article, something I don't understand.

The article talks about keeping crawlers out of the site or out of specific folders, but I don't understand why the crawlers are establishing all those port 80 connections, which is effectively shutting down the site.  Is that their intent?

Thanks again.
ASKER CERTIFIED SOLUTION
Guy Lidbetter

SOLUTION
Lucas Bishop

gateguard

ASKER
Where do I put a robots.txt file?  I don't have one right now.
Lucas Bishop

It goes in the root folder of your web site.
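In practice that means crawlers must be able to fetch it at /robots.txt. On a default Tomcat install that is usually the webapps/ROOT directory, though that path is an assumption about your setup. As a rough sketch, the snippet below holds a hypothetical robots.txt (the /admin/ path and the 10-second delay are made-up examples, not values from this thread) and uses Python's standard urllib.robotparser to check how a well-behaved crawler would read it. Note that Crawl-delay is a non-standard directive: some crawlers honor it, others (Googlebot among them) ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path and delay are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
"""


def load_rules(text):
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# example.com is a placeholder host; "AnyBot" is a made-up crawler name.
print(rules.can_fetch("AnyBot", "http://example.com/index.html"))  # True
print(rules.can_fetch("AnyBot", "http://example.com/admin/page"))  # False
print(rules.crawl_delay("AnyBot"))                                 # 10
```

Keep in mind that robots.txt is only a request. Crawlers that ignore it still have to be blocked at the firewall or the web server.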
Guy Lidbetter

Hi gateguard,

The Wikipedia page in my very first post explains everything about the robots.txt file.

Regards

Guy