Julian Matz asked:

Linux server keeps "crashing"

I have a Debian Linux server that started becoming unavailable/unresponsive about two weeks ago. In my experience, this is usually caused by high server load from Apache, a poorly written PHP script, a corrupt database, or sometimes a disk I/O issue. That doesn't appear to be what is happening here, though. I installed various utilities, such as sysstat, to log and warn about high server load, and according to these, load was not the issue.

It was also not a network issue, since the system log stopped logging at the times the server went down. If it had been just a network problem, the system log would have continued to log.

I also couldn't find anything useful in syslog.

Here's an example of what my server load average looked like during the last "crash" (the server went down just after 13:35 or 13:36, and was restarted at 15:14):

# sar -q -f /var/log/sysstat/sa02 -s 13:00:01
Linux 2.6.26-2-686      01/02/13        _i686_

13:05:01      runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
13:15:01            2       153      0.13      0.06      0.01
13:25:02            0       146      0.07      0.18      0.11
13:35:01            3       147      0.12      0.11      0.09
Average:            2       149      0.11      0.12      0.07

15:14:40          LINUX RESTART

15:15:01      runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
15:25:01            2       152      0.14      0.56      0.50
15:35:01            2       152      0.00      0.10      0.26
15:45:01            2       145      0.02      0.11      0.18
15:55:01            1       157      0.10      0.05      0.10
16:05:01            1       156      1.32      0.99      0.48
16:15:01            1       173      0.72      1.09      0.81
16:25:01            2       151      0.10      0.23      0.46
16:35:01            3       145      0.00      0.04      0.24
16:45:01            2       169      0.15      0.61      0.45
16:55:01            2       169      0.18      0.18      0.27
17:05:01            2       161      0.08      0.30      0.30
17:15:01            4       162      0.23      0.31      0.29
17:25:01            2       164      0.03      0.08      0.16
17:35:01            2       165      0.06      0.05      0.09
17:45:01            2       168      0.00      0.02      0.05
17:55:01            2       164      0.03      0.10      0.08
18:05:01            2       168      0.15      0.21      0.13
Average:            2       160      0.19      0.30      0.29
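
To make the outage window easier to spot in longer reports, something like the following might help. It is only a rough sketch: `find_gaps` is a name made up for this example, it assumes 24-hour timestamps as in the output above and a 10-minute sampling interval, and it does not handle reports that cross midnight. It reads `sar -q` output on stdin and prints every pair of consecutive samples that are further apart than the given number of seconds.

```shell
#!/bin/sh
# find_gaps (hypothetical helper): read `sar -q` output on stdin and report
# consecutive data samples that are more than MAX seconds apart.
# Header lines, "Average:" lines and "LINUX RESTART" lines are skipped because
# their second field is not a plain number.
find_gaps() {
  awk -v max="${1:-900}" '
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9]+$/ {
      split($1, t, ":")
      now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
      if (prev != "" && now - prev > max)
        printf "gap: %s -> %s (%d s)\n", prevstamp, $1, now - prev
      prev = now
      prevstamp = $1
    }'
}

# Possible usage, with 900 s leaving some slack over the 600 s interval:
#   sar -q -f /var/log/sysstat/sa02 | find_gaps 900
```

Run against the report above, this should flag the jump from the 13:35:01 sample to the 15:25:01 sample, which brackets the time the server was down.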


I'm wondering if someone might be able to help me identify the problem. I realise there could be many possibilities, but a couple of starting points would be good.

Many thanks!

ASKER CERTIFIED SOLUTION
farzanj

This solution is only available to members.

Julian Matz (ASKER):

There was nothing in the logs, but the hardware, bar the hard drive, was replaced, and I haven't had any crashes since. I'm not sure whether the motherboard was replaced, actually. I was guessing it could have been the CPU, but I could be wrong; there's no way to know for sure now, but the main thing is that it's fixed. Thanks for your help/suggestions.