Eric - Netminder

Too much traffic?

A colleague (domain obfuscated by me) writes:

"Much as we love the idea of higher traffic, the big numbers become less desirable when it becomes clear that there are no actual eyeballs behind many of those visits.

We're seeing a surge in non-human traffic on Domain Online, and are not quite sure what to do about it. These are not day-to-day spikes separated by days of "normal" traffic. Since March of this year, there has been a steady increase in stress on our servers. Four weeks ago, traffic that was already double or triple our "normal" numbers quadrupled. This traffic has been so significant that our webservers (two IIS servers) have, on occasion, been knocked offline or nearly so.

For the month of May, our Urchin Tracking Module (UTM), which counts only sessions and page views from browsers that accept JavaScript, showed daily traffic of about 35,000 sessions, 3,400,000 hits and 80,000 page views. Over the same period our non-UTM stats, which include traffic of all sorts, show daily traffic of 105,000 sessions, 3,400,000 hits and 1,175,000 page views. Of the total hits, we're showing 3,035,000 coming from robots (63% coming from the Mozilla Compatible agents, 1.3 million identifying themselves as the Googlebot).

We're using a Windows 2003 SQL server database. Our tech folks ruled out the possibility of an SQL injection attack because we weren't getting hit by a single domain range.

Anybody have experience with these kinds of ratios of human to non-human traffic? Will adding webservers help us? Other solutions?"

Any suggestions would be appreciated.

ep
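
For anyone who wants the ratios spelled out, here is a rough sketch that works them out from the daily May figures quoted above. It assumes the UTM counts (JavaScript-capable browsers) approximate human traffic and that the 1.3 million Googlebot hits are part of the robot hits; neither is confirmed by the logs, and the function names are made up for illustration.

```python
# Daily averages for May, copied from the figures quoted in the question.
UTM_SESSIONS = 35_000        # sessions seen by the JavaScript-based tracker
ALL_SESSIONS = 105_000       # sessions in the non-UTM (all traffic) stats
UTM_PAGE_VIEWS = 80_000
ALL_PAGE_VIEWS = 1_175_000
ALL_HITS = 3_400_000
ROBOT_HITS = 3_035_000
GOOGLEBOT_HITS = 1_300_000   # assumed to be a subset of ROBOT_HITS


def share(part, total):
    """Return part as a fraction of total."""
    return part / total


def summarize():
    """Compute the traffic shares implied by the quoted figures."""
    return {
        "human share of sessions": share(UTM_SESSIONS, ALL_SESSIONS),
        "human share of page views": share(UTM_PAGE_VIEWS, ALL_PAGE_VIEWS),
        "robot share of hits": share(ROBOT_HITS, ALL_HITS),
        "Googlebot share of robot hits": share(GOOGLEBOT_HITS, ROBOT_HITS),
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value:.1%}")
```

Under those assumptions, about one session in three and roughly 7% of page views look human, while close to 89% of hits are attributed to robots.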
SOLUTION
Jk387

ASKER

It's likely that the site is being Digged etc., which could account for a lot of the traffic, I suppose. They seem to doubt that it's a DoS attack based on the logs, as noted in the question, but I'll pass it along too.

I'm going to leave this open to see if I get any other ideas/suggestions. I'll also answer any questions as best I can in order to get some other specifics.

ep
Could you let us in on the nature of the website?
Is it a forum, a blog, etc.?
This would throw some more light on the situation.

Getting Digged is unlikely, because you mentioned that most of the traffic comes from automated bots.
The Digg effect (when your site gets Digged) is when actual people open the site and look at it.
That would register in your Urchin records as actual people and not as automated bots.
www.poynter.org (realized that there's no compelling reason to hide it).

There's a lot going on there as you can see.

ep
routinet,

I've asked if they subscribe to something like ScanAlert, but your explanation of spiders seems more plausible.

keith_alabaster,

At this point, I ask questions more for other people than I do for myself; that means that either I'm not doing anything I haven't been doing for a while, or I have all my acquaintances so buffaloed that they think I know everything.

I've not received a reply to the message I sent them (see my comment to routinet; your questions were in the same email), but when I do, I will post immediately. I have also asked if they have considered load balancers.

Redimido,

Thanks. I've already sent to them some information on the use of robots.txt to limit the Googlebot-type scans; in it, I did mention non-Google googlebots, though I've never personally heard of such a thing. That's not to say they don't exist -- just to say that I've never seen one.

ep
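
To make the robots.txt idea a bit more concrete, here is a small sketch that uses Python's standard `urllib.robotparser` to check what a given set of rules would allow. The rules and paths (`/search/`, `/calendar/`, `/articles/`) are invented for illustration and are not taken from the actual site. Keep in mind that robots.txt only restrains crawlers that choose to obey it, and that `Crawl-delay` is a non-standard directive that not every crawler honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the paths and the delay value are assumptions.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /search/
Disallow: /calendar/
Crawl-delay: 10
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/search/results"),
        ("Googlebot", "/articles/1"),
        ("SomeOtherBot", "/calendar/may"),
    ]:
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(f"{agent} -> {path}: {verdict}")
    print("Crawl delay for other bots:", parser.crawl_delay("SomeOtherBot"))
```

Checking rules this way before publishing them helps avoid accidentally blocking pages that should stay indexed.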
I kinda agree with routinet.
It could very well be because your popularity is increasing.

A search on Alexa and Compete does show an increase in people visiting the website since Jan 08, and the trend has been upward.

So I am guessing as more people are reading the content and then blogging and linking back to the articles on their blogs, it could very well be the spiders crawling on your website.
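
One way to check which spiders are actually responsible is to tally requests per user agent straight from the web server logs. The sketch below assumes IIS is writing W3C extended logs that include the `cs(User-Agent)` field; the sample lines, IP addresses, and the `count_user_agents` helper are all made up for illustration.

```python
from collections import Counter

# Made-up sample in W3C extended log format; real logs may list other fields.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 6.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status cs(User-Agent)
2008-05-01 00:00:01 192.0.2.10 GET /index.asp 200 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)
2008-05-01 00:00:02 192.0.2.11 GET /index.asp 200 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1)
2008-05-01 00:00:03 192.0.2.10 GET /about.asp 200 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)
"""


def count_user_agents(lines):
    """Count requests per user agent, using the #Fields header to find the column."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names each whitespace-separated column.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or data before any header
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows rather than guess
        record = dict(zip(fields, values))
        counts[record.get("cs(User-Agent)", "-")] += 1
    return counts


if __name__ == "__main__":
    for agent, hits in count_user_agents(SAMPLE_LOG.splitlines()).most_common():
        print(hits, agent)
```

A user agent string can be forged, so a tally like this shows what the traffic claims to be, not necessarily what it is.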
ASKER CERTIFIED SOLUTION
Indeed. You need to work on making the site scalable. Maybe a load balancer or a caching network is in order.
Thank you, all. I appreciate the ideas.

As I have not heard back, I'm going to close this question; if some specifics are requested of me, I will open a new question using the Ask A Related Question feature to ensure that you are all notified.

Great work, folks.

ep