Solved

Website post-mortem

Posted on 2009-07-03
Last Modified: 2013-11-08
I have had a very bad experience that I would like to learn from:

I built a CentOS 5 server running Apache, MySQL, and Drupal for an election application, for use at a convention that had about 300 voters.  I built it on a moderately strong PC and attached it to a gigabit switch.  Users connected to the server via a couple of Cisco access points that I placed on opposite sides of a large room.

The users were able to attach to the wireless networks without issue, and the machine also acted as a DHCP server without issue.  Once they tried to log into the Drupal system, the server stopped responding to any and all network requests, including my attempts to SSH into the machine.

Basically I am sure that the system was far from able to handle the requests from all those users at once.  I am looking for theories on where I went wrong.

Did I grossly underestimate the computer hardware?
Did I blunder by putting all of the services on a single machine?
Was a vanilla install of Apache, MySQL and Drupal in need of major tuning?
How can I stress test such a system to know what it is capable of?

I appreciate your input.  It was a very bad day.
Question by:chronolith
LVL 12

Accepted Solution

by:
kevin_u earned 250 total points
ID: 24775540
Based on what you said, it didn't seem grossly under-sized.

Tuning could have been an issue, but I have successfully dealt with 1500 posts per minute on an untuned box without losing control of it, and the users hardly noticed. (I had forgotten to tune it; it was a quick deploy.)

Did you regain control of the machine without a reboot?   It's possible it was a Wi-Fi overload: 11 Mbps shared by 300 users across 2 APs?  It depends on the size(s) of the pages, I guess.
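To put rough numbers on the Wi-Fi theory, here is a back-of-the-envelope sketch. The 11 Mbps rate, 2 APs, and 300 users come from the thread; the 50% efficiency factor and the 100 KB page size are assumptions for illustration only, and the calculation also assumes the two APs do not interfere with each other.

```python
def per_user_kbps(link_mbps, access_points, users, efficiency=0.5):
    """Average usable bandwidth per user in kbit/s if every user is active at once.

    efficiency is an assumed fraction of the nominal rate left after protocol
    overhead and contention; 0.5 is a rough guess, not a measurement.
    """
    usable_mbps = link_mbps * access_points * efficiency
    return usable_mbps * 1000 / users


def page_load_seconds(page_kb, kbps):
    """Seconds to transfer one page of page_kb kilobytes at kbps kilobits per second."""
    return page_kb * 8 / kbps


rate = per_user_kbps(link_mbps=11, access_points=2, users=300)
print(f"per-user share: {rate:.1f} kbit/s")
# page_kb=100 is a hypothetical page weight; measure the real Drupal pages instead.
print(f"100 KB page: {page_load_seconds(100, rate):.1f} s")
```

Under those assumptions each user gets roughly 37 kbit/s, so a 100 KB page would take about 22 seconds if everyone loaded it at the same moment. That would make the site painfully slow, although by itself it would not explain the server refusing SSH over the wired side.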

You might want to consider that you had a malicious attack.  Someone might have decided to vote with a script.  What hints do the logs provide?

Basic stress testing would involve scripting the site access and posts, using something as simple as curl or as complicated as a commercial web app testing suite.
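As one illustration of the scripted approach, the following is a minimal sketch in Python using only the standard library. It fires concurrent GET requests and reports how many succeeded and how long they took. The function names are made up for this example, and it does not log in or post votes; a realistic test of the Drupal site would also need to script the login form, whose fields are site-specific.

```python
import http.client
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Request url once and return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            ok = 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        ok = False
    return ok, time.monotonic() - start


def run_load_test(url, total_requests, concurrency):
    """Issue total_requests GETs with up to `concurrency` in flight at once."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch, [url] * total_requests))
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "ok": sum(1 for ok, _ in results if ok),
        "failed": sum(1 for ok, _ in results if not ok),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }
```

A call such as `run_load_test("http://your-test-server/", total_requests=600, concurrency=300)` (the URL is a placeholder) would approximate 300 voters arriving together. Raising the concurrency step by step and watching where failures or latency climb gives a rough idea of what the box can take.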

Again, I'd be looking at the logs for hints.

LVL 30

Assisted Solution

by:Kerem ERSOY
Kerem ERSOY earned 250 total points
ID: 24779859
Hi,

I also think that this is not an issue with undersized hardware. It is not normal for any system to stop responding entirely, especially not with 300 users. I am the sysadmin for a local Linux-Users-Society with more than 1500 members, and a Xen system with 2 vCPUs handles it quite easily. We have more than 5 lists, lots of Trac wikis, and several voting areas.

There may be a hardware problem, such as a RAM or hard-disk malfunction. Other than that, your account of what happened is not consistent with a system having problems under load.

Author Comment

by:chronolith
ID: 24787201
I was not able to regain any kind of control and I had to power the thing down.

As for the "server", it was not a server grade machine at all.  Just a mid-level PC.  I was tasked with building this thing without spending money.

I tend not to think it was any kind of malicious attack; this particular community is just not capable of these sorts of things.  In fact, they were not even exposed to the system until minutes before the voting was to take place.

The Cisco APs were both set for 802.11g and 802.11b.  Right now a Wi-Fi overload is my favorite theory.

My review of the logs did not show anything too terrible, apart from the fact that I had not raised the Apache MaxClients option above the vanilla setting of 150.  In that case I would expect to see maybe half of the users getting refused, but not all of them.
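One common rule of thumb for MaxClients works in the other direction: rather than asking whether 150 is too low for 300 users, ask whether the machine has enough RAM for 150 Apache children at all. The sketch below shows that arithmetic. Every number in it is hypothetical (the thread does not say how much RAM the PC had or how large each Apache process with Drupal loaded was), so treat it as a way to frame the question, not as a diagnosis.

```python
def memory_bound_max_clients(total_ram_mb, reserved_mb, per_child_mb):
    """Largest number of Apache children that fit in RAM without swapping.

    reserved_mb is memory set aside for MySQL, the OS, and everything else.
    """
    available = total_ram_mb - reserved_mb
    if available <= 0 or per_child_mb <= 0:
        raise ValueError("no memory left for Apache children")
    return int(available // per_child_mb)


# All three figures are hypothetical, chosen only to illustrate the arithmetic.
limit = memory_bound_max_clients(total_ram_mb=2048, reserved_mb=768, per_child_mb=40)
print(limit)
```

With these made-up figures only 32 children fit in memory, while 150 children at 40 MB each would want about 6 GB. If the real numbers were anywhere near that, the box could have been pushed deep into swap, which is one possible explanation for a machine that stops answering everything, SSH included.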

Should I perhaps think about getting two machines and clustering them?
LVL 30

Expert Comment

by:Kerem ERSOY
ID: 24789730
I don't think you need to cluster machines. Clustering only makes sense once you have confirmed that you are reaching a very high number of transactions per second. It seems to me that there was a software or hardware failure in your setup.

To verify this, I suggest you set up your computer again. Install the sysstat package through yum and use the sar -q and iostat commands to check whether this is a load problem.
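Since the machine had to be powered off, it may also help to keep a record that survives a hard reset. The following is a small Python sketch, not a replacement for sar or iostat, that appends the system load averages to a file and forces each line to disk. The function names and the log path in the usage note are assumptions made for this example.

```python
import os
import time


def sample_line():
    """One log line: Unix timestamp plus the 1, 5, and 15 minute load averages."""
    one, five, fifteen = os.getloadavg()
    return f"{int(time.time())} load1={one:.2f} load5={five:.2f} load15={fifteen:.2f}\n"


def log_load(path, samples, interval_s):
    """Append `samples` lines to path, forcing each one to disk before sleeping."""
    with open(path, "a") as log:
        for _ in range(samples):
            log.write(sample_line())
            log.flush()
            os.fsync(log.fileno())  # make sure the line survives a hard power-off
            time.sleep(interval_s)
```

Running something like `log_load("/var/tmp/load.log", samples=720, interval_s=5)` during a stress test records an hour of samples. If the last lines before a hang show the load average climbing steeply, that points toward a load problem; if they look calm, a hardware or software fault becomes more likely.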