• Status: Solved

Why does my dedicated server crash weekly?

I have a dedicated server running CentOS that is hosted with 1and1.com.

Almost like clockwork, the server crashes weekly, usually on a Monday or Friday. I then have to go into the admin recovery area and reboot the server. This is an unmanaged server, so 1and1.com support will not troubleshoot the problem.

I am running about 8 sites on it, most of them either Joomla or WordPress sites. The server uses Plesk as its control panel, and I usually have to reboot the server twice to get Plesk working again.

Plesk error:
ERROR: PleskFatalException

Components::componentUpdate() failed: Unable to exec utility packagemng: Empty error message from utility.

--------------------------------------------------------------------------------

0: /usr/local/psa/admin/auto_prepend/auth.php3:530

Attached is my log file for the last 24 hours. If someone could look at it and spot something I can fix, I would appreciate it.
log.txt
Asked by: Donnie Walker
 
WizRd-Linux commented:
The system log will be more useful. Can you provide the /var/log/messages file for the same time period? Also, please omit any confidential information where possible.
 
Monis Monther (System Architect) commented:
You need to monitor your server, so start looking at monitoring tools:

Check your resources with top.
Read your logwatch reports.
Use sar and Nagios.

Also check for attacks, since you might have been hacked. Install rkhunter, a tool that checks for rootkits.
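
As a stopgap until a proper tool such as sar or Nagios is in place, a periodic snapshot of disk usage and load average can show what the server looked like just before it became unresponsive. The following is a minimal sketch, assuming Python 3 is available on the server. The function names and the log path in the comment are hypothetical choices, not anything taken from this thread.

```python
import os
import shutil
import time


def snapshot(path="/"):
    """Return one text line: timestamp, disk usage of `path`, load averages."""
    usage = shutil.disk_usage(path)          # total, used, free in bytes
    used_pct = 100.0 * usage.used / usage.total
    load1, load5, load15 = os.getloadavg()   # Unix only
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return (f"{stamp} disk_used={used_pct:.1f}% "
            f"load={load1:.2f},{load5:.2f},{load15:.2f}")


def append_snapshot(log_path, path="/"):
    """Append one snapshot line to `log_path`."""
    with open(log_path, "a") as fh:
        fh.write(snapshot(path) + "\n")


# Example call, e.g. from a cron job every minute (hypothetical path):
# append_snapshot("/var/tmp/health.log")
```

If the disk really does fill up completely, the last write may fail, so the final lines before the gap are the interesting ones.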
 
Donnie Walker (Author) commented:
Here is the /var/log/messages log file from May 1st to May 13th.

Last month 1and1.com swapped out the hardware and we reimaged the OS. I then reset all the sites, etc.
For about 2 weeks everything was fine, and then it went back to crashing on the weekends.

Yesterday was really bad, as I had to reboot the server 4 times before it stopped crashing.
messages.txt
 
apexinternet commented:
It sounds like some sort of hardware problem, even though they already swapped out the hardware. A crash in Linux is really rare on good hardware. If possible, try to narrow down the time it crashes, and look in /var/log/messages as already mentioned. You can use a free monitoring service like http://mon.itor.us to help. If entries are missing from the log, it's possible that the crash is being caused by a disk problem and the log couldn't be written at the time.

Good luck and let us know if you can narrow down the times.
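
One way to narrow down the crash times is to look for long silences between consecutive entries in /var/log/messages. The sketch below is only an illustration: it assumes the traditional syslog timestamp prefix ("May 14 09:12:01"), an English locale for month names, and a year supplied by the caller, since that format does not record one. The function name and the 10-minute threshold are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year, min_gap=timedelta(minutes=10)):
    """Return (before, after) timestamp pairs where the log was silent
    for at least `min_gap`."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # The first 15 characters hold "Mon DD HH:MM:SS".
            current = datetime.strptime(f"{year} {line[:15]}",
                                        "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a line that starts with a syslog timestamp
        if previous is not None and current - previous >= min_gap:
            gaps.append((previous, current))
        previous = current
    return gaps


# Example call (the year is an assumption you must supply):
# with open("/var/log/messages") as fh:
#     for before, after in find_gaps(fh, year=2024):
#         print(before, "->", after)
```

A gap that ends with boot messages marks a crash or reboot, and the entries just before the gap are the ones worth reading closely.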

--
Chris
 
Donnie Walker (Author) commented:
OK, this morning everything started to slow down. Plesk reported we were using all 160 GB of our hard drive space. I rebooted, and it went back to normal and showed we were using 1 GB of our disk space.

Just now all the sites crashed again. I rebooted again. Attached is today's log file.
messages-may-14.txt
 
Donnie Walker (Author) commented:
It happened again. I may be wrong, but it looks like it is doing something with rebuilding the RAID before it dies.

Somehow it is filling up the hard drive, and this causes the server to crash.
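
To see where the space is going while the disk is filling up, the sizes of the entries under a suspect directory can be compared. This is a minimal sketch in Python 3; the function names and the "/var" starting point in the comment are assumptions, and the walk only counts files that still have a name on disk.

```python
import os


def dir_size(path):
    """Sum the apparent sizes of all files below `path` (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_children(parent, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs for the entries
    directly under `parent`, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.path))
            elif entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                sizes.append((size, entry.path))
    return sorted(sizes, reverse=True)[:limit]


# Example call (hypothetical starting point):
# for size, path in largest_children("/var"):
#     print(f"{size / 1e9:8.2f} GB  {path}")
```

Because the space came back after a reboot, the culprit may be temporary files or files that were deleted while a process still held them open; the latter would not show up in a directory walk like this one.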
messages-may-15.txt
 
apexinternet commented:
I see the reboots, but there are no visible hardware errors in the log. The end of this last log shows your RAID devices being built; however, it looks like that happened during the boot process, which is normal.

Is the symptom just that the server slows down? If so, are you able to get into the server and run "top"? This will tell you which process(es) are using up your resources. This isn't looking like the kernel or hardware problem I thought it was. Plesk is a commercial (not open source) product, so who knows whether its errors are related to this or not.

--
Chris
 
Donnie Walker (Author) commented:
The sites start to slow down. Anything related to JavaScript or video disappears, and then the server becomes unresponsive.
 
Donnie Walker (Author) commented:
Any other suggestions?
 
apexinternet commented:
I would wait for it to slow down again and, while it is in that state, first make a note of the time. Then remote in (I am assuming they gave you ssh access?) and run "top". The processes using the most resources will be listed at the top. Make a note of them, and post what top shows if you can. This will help pinpoint which process(es) are taking up the resources and tell us which additional logs need to be checked.
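
If typing at an interactive "top" is difficult while the server is crawling, the same information can be captured in one shot and saved for posting. The sketch below is an illustration only: it assumes Python 3.7+ and a `ps` that accepts `-eo pid,pcpu,pmem,comm`, and the function names are hypothetical.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse `ps -eo pid,pcpu,pmem,comm` output and return the `limit`
    busiest processes, sorted by CPU share and then memory share."""
    rows = []
    for line in output.splitlines()[1:]:      # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue                           # blank or malformed line
        pid, cpu, mem, command = parts
        rows.append({"pid": int(pid), "cpu": float(cpu),
                     "mem": float(mem), "command": command})
    rows.sort(key=lambda r: (r["cpu"], r["mem"]), reverse=True)
    return rows[:limit]


def top_processes(limit=5):
    """Run ps once and return the busiest processes."""
    output = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(output, limit)


# Example call:
# for row in top_processes():
#     print(row["pid"], row["cpu"], row["mem"], row["command"])
```

Note that the CPU column from ps is an average over each process's lifetime rather than the live figure top shows, so treat it as a rough pointer to which process to investigate.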