Solved

Find out which process caused a crash!

Posted on 2004-08-30
Last Modified: 2013-12-15
Hi:

Our server crashed two days ago, and it looks like it ran out of memory and got overloaded.  Is there any way to get an idea of which process caused the crash?  

I ran:

sar -u sa28

and discovered that between 12:40pm and 12:50pm something took over the CPU and didn't let go until it finally crashed 5.5 hours later.

My questions are: how can I get a listing of the processes that were running and/or created during that 10-minute period?  Is there any way to tell which processes were demanding the most CPU time at the time of the crash?

Thanks for your help.

-Charlie
Question by:gothamww
LVL 40

Accepted Solution

by:
jlevie earned 125 total points
ID: 11937341
sar only maintains past statistics, so you can't get much more than you have gotten from it. Unless you were running a much more extensive set of logging tools, or the process in question caused something to be written to the messages file, you probably can't tell what ran away with the CPU. But you might be able to guess who the likely culprits were from what this machine is used for. What services does this box provide?
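
As an illustration of the "more extensive logging" mentioned above, here is a minimal sketch (not something from this thread's setup) that records the busiest processes to a file so there is something to look at after the next crash. It assumes a `ps` that accepts `-eo pid,pcpu,pmem,comm`; the function names and the log path are hypothetical, and it would need to be run periodically, for example from cron.

```python
import subprocess
import time


def top_cpu(ps_output, n=5):
    """Return the n busiest processes from `ps -eo pid,pcpu,pmem,comm` text."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # ignore anything that is not a full ps row
        pid, pcpu, pmem, comm = parts
        rows.append((float(pcpu), float(pmem), int(pid), comm.strip()))
    rows.sort(reverse=True)  # highest CPU share first
    return [(pid, pcpu, pmem, comm) for pcpu, pmem, pid, comm in rows[:n]]


def log_snapshot(path="/var/tmp/proc-snapshots.log"):  # hypothetical path
    """Append one timestamped line per busy process to the log file."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as fh:
        for pid, pcpu, pmem, comm in top_cpu(out):
            fh.write(f"{stamp} pid={pid} cpu={pcpu} mem={pmem} {comm}\n")
```

With snapshots like these on disk, the 12:40pm to 12:50pm window could be read back directly instead of guessed at.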
 

Author Comment

by:gothamww
ID: 11942192
Mainly it's used for a dynamic web site, so httpd and mysql would be the main services.
 
LVL 40

Assisted Solution

by:jlevie
jlevie earned 125 total points
ID: 11942419
What scripting language is being used (Perl, PHP, etc.)? Unless someone has dinked with php.ini and disabled the failsafes, PHP shouldn't be able to do this, as it has runtime memory and execution limits. Perl or C code is another matter.

I'd suggest looking at the web logs for the time of interest and see if it provides any clues as to what was happening.
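
A rough sketch of that log check, assuming the web server writes Apache's Common Log Format (the regular expression, function name, and sample paths are illustrative, not taken from this server). It counts which paths were requested inside a time window, using the wall-clock time as written in the log.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the [timestamp] and the "METHOD /path ..." request of a CLF line.
LOG_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)')


def requests_in_window(lines, start, end):
    """Count requests per path whose logged local time is within [start, end]."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        # Drop the UTC offset so we compare against the server's wall clock.
        if start <= ts.replace(tzinfo=None) <= end:
            counts[m.group("path")] += 1
    return counts.most_common()
```

Called with the access log lines and the two ends of the suspicious ten minutes, it returns the most-requested paths first, which may point at the script that ran away.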

Author Comment

by:gothamww
ID: 11942589
Thanks for the suggestion, I'll check the web logs.

We use both Perl and PHP.  Just so I understand better, what do you mean by "runtime memory and execution limits"?
 
LVL 40

Assisted Solution

by:jlevie
jlevie earned 125 total points
ID: 11945269
In the php.ini file there are limits set for various things, like how much memory a PHP page can use, max HTML page size, max CPU time, etc. Those are there to keep a runaway PHP script from killing the web server. Perl doesn't have any imposed limits, so a Perl script that gets into an infinite loop can kill the server.
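
PHP's `memory_limit` and `max_execution_time` settings are examples of the limits described above. For scripts that have no such built-in ceiling, one possible approach (an assumption on my part, not something proposed in this thread) is to start them under operating-system resource limits. The sketch below uses Python's `resource` module on a Unix system; `run_limited` and `demo` are hypothetical names.

```python
import resource
import subprocess
import sys


def run_limited(cmd, cpu_seconds, mem_bytes=None):
    """Run cmd with a CPU-time cap (and optional address-space cap).

    Returns the exit status; a negative value means the child was
    terminated by a signal, as happens when the CPU limit is exceeded.
    """
    def apply_limits():
        # Runs in the child just before exec, so only the child is limited.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        if mem_bytes is not None:
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limits).returncode


def demo():
    """Start a deliberately runaway child with a 1 second CPU budget."""
    return run_limited([sys.executable, "-c", "while True: pass"], cpu_seconds=1)
```

The kernel stops the looping child once its CPU budget is spent, so a single bad script cannot hold the processor for hours.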
 

Author Comment

by:gothamww
ID: 11946194

Thanks so much. Just one last question: is there software out there that will alert the administrator when the load on the server has gotten too high for an extended period of time?  It would be nice to get notified BEFORE the server crashes, if possible.
 
LVL 40

Assisted Solution

by:jlevie
jlevie earned 125 total points
ID: 11946720
There are a number of server monitoring packages out there, like BigBrother (http://bb4.com/), Nagios (http://www.nagios.org/), etc. They can be configured to alert on a number of things, including load average.
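
To show the idea behind a load-average alert (this is only a toy sketch, not how BigBrother or Nagios work internally), the following polls `os.getloadavg()` and fires once the 1-minute load has stayed above a threshold for several consecutive samples. The class name, threshold, sample count, and interval are all made-up values to adjust for the machine.

```python
import os
import time
from collections import deque


class LoadWatcher:
    """Tracks recent load samples and reports sustained overload."""

    def __init__(self, threshold, samples_needed):
        self.threshold = threshold
        self.recent = deque(maxlen=samples_needed)  # keeps only the newest samples

    def observe(self, load):
        """Record one sample; True if every retained sample exceeds the threshold."""
        self.recent.append(load)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(v > self.threshold for v in self.recent)


def watch(threshold=8.0, samples_needed=5, interval=60, alert=print):
    """Poll the 1-minute load average forever and call alert() on sustained load."""
    watcher = LoadWatcher(threshold, samples_needed)
    while True:
        load1, _, _ = os.getloadavg()
        if watcher.observe(load1):
            alert(f"load {load1:.2f} above {threshold} for {samples_needed} samples")
        time.sleep(interval)
```

Passing a function that sends mail or a page as `alert` would turn this into a crude early warning; a real monitoring package adds retries, escalation, and history.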