dedicated server goes down for no reason

Hello,
I have a dedicated server at my hosting company. It has CentOS 4.3 installed with all
the updates and everything, but the server sometimes goes down for no apparent reason, so I have to
send a reboot request! Can someone please help or tell me why this is happening?
All the cron jobs have been checked already!
Sabrin Asked:

talkster5 Commented:
ACPI handles power management. The operating system controls it, but as far as I know it also has to be enabled in the BIOS so that the operating system can interact with the hardware. However, this may not actually be the cause of the problem, and I cannot give you a definite answer until I see the log from before the restart, as this will most likely show what changed or went wrong. The log file after the reboot only shows me that everything has come back online and is running.

ibu1 (System Administrator) Commented:
What does your /var/log/messages contain?

Sabrin (Author) Commented:
It's empty.

Sabrin (Author) Commented:
My /var/log/messages shows me a lot!
What am I looking for?

ibu1 (System Administrator) Commented:
What does "sometimes goes down for no reason" mean? Does it hang, or something else?
Can you post the contents here?
Look at what errors it shows before you restart the server.

ravenpl Commented:
As usual in such cases: have you checked your RAM already? http://www.memtest.org/

Sabrin (Author) Commented:
I only have root SSH access. How would I test the RAM?

ravenpl Commented:
Ask the admin to reboot and test with memtest86 - if it's free...

You may try scripts like http://people.redhat.com/dledford/memtest.html
Another test, simple and efficient, is to spawn a kernel compilation with an unlimited number of jobs. http://linuxmafia.com/faq/VALinux-kb/ram-testing.html
But note: if the test fails, you know the RAM is broken; if the test passes, you in fact know nothing.
On the other hand, if memtest86 reports no errors for, say, 1 hour, you can be pretty sure it's fine.

Sabrin (Author) Commented:
It's not the memory!

talkster5 Commented:
Hi,
Can you open /var/log/messages with nano, press Ctrl+W, type "restart" and press Enter? This should take you to a line saying the system has restarted, with a load of kernel lines below it. If you scroll up a bit and check the entries before the restart, you may be able to find out why the network cut out.
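
If the file is too long to scroll through comfortably, the same search can be scripted. The following is only a minimal Python 3 sketch, not a definitive tool: it collects the lines that came just before each line containing "restart". The function name `context_before` and the 20-line window are assumptions made for illustration; only the path /var/log/messages comes from this thread. A CentOS 4 box would likely not have Python 3, so it may be easier to run this on a downloaded copy of the file. Also, as far as I know, syslogd writes a "restart." line when it is reloaded after log rotation too, so not every match has to be a reboot.

```python
from collections import deque


def context_before(lines, needle="restart", before=20):
    """Yield (matching_line, preceding_lines) for each line containing needle.

    `before` is how many earlier lines to keep as context (an arbitrary choice).
    """
    recent = deque(maxlen=before)  # rolling window of the most recent lines
    for line in lines:
        line = line.rstrip("\n")
        if needle in line:
            yield line, list(recent)
        recent.append(line)


# Possible usage on a copy of the log (path taken from the thread):
# with open("/var/log/messages", errors="replace") as log:
#     for match, earlier in context_before(log):
#         print("\n".join(earlier))
#         print(">>>", match)
```

Anything unusual in the context lines (kernel errors, out-of-memory messages, disk errors) is what to look for; if the context is empty or routine, the machine probably froze without logging anything.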

Thanks.

Sabrin (Author) Commented:
This is what I see:

Nov 21 04:03:33 dedicated syslogd 1.4.1: restart.

Just lines like that. Nothing else about the system; only syslogd has "restart".

talkster5 Commented:
Hi,
Could you put up a link so I can download your messages file from your server and take a proper look at it for you?

Thanks.

Sabrin (Author) Commented:
Yes, you can download them from here:
members.lycos.co.uk/eehost/messages/

Sabrin (Author) Commented:
There was a reboot request on Nov 20.

Sabrin (Author) Commented:
I gave you the logs from Nov 19 to Nov 21.

talkster5 Commented:
Hi,
It looks like there may be an issue with ACPI, which I have known to cause problems on a server I ran before. It may be worth recompiling the kernel without ACPI support and getting it disabled in the BIOS.

Failing that, the only other time I have seen something like what you describe was when all the memory was in use and the swap was too small, causing the server to freeze until it was rebooted. This may not be the case here, but it may be worth checking.
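
To check the memory and swap theory over SSH, the kernel's own figures in /proc/meminfo can be read directly. Here is a rough Python 3 sketch under stated assumptions: the helper names `parse_meminfo` and `swap_used_fraction` are made up for this example, and the field names (MemFree, SwapTotal, SwapFree) are the standard /proc/meminfo ones, reported in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])  # kB for the memory and swap fields
    return info


def swap_used_fraction(info):
    """Return the used share of swap (0.0 to 1.0), or None if there is no swap."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return None
    return (total - info.get("SwapFree", 0)) / total


# Possible usage on the server:
# with open("/proc/meminfo") as f:
#     info = parse_meminfo(f.read())
# print(info.get("MemFree"), swap_used_fraction(info))
```

A single reading only shows the current state. If swap is nearly full shortly before each freeze, that would support the theory; a healthy reading right after a reboot proves little.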

Thanks.

Sabrin (Author) Commented:
Hello talkster5,
Today, Nov 23, at 3am the server stopped responding, so I sent a reboot
request so they could manually reboot the server. When the server came
back up I copied the messages file and uploaded it to the site so you
can please check it one more time to make sure it's the ACPI.
Here: http://members.lycos.co.uk/eehost/messages/
Thanks

talkster5 Commented:
Hi,
It looks like there is something wrong with ACPI, but without seeing what actually comes before the restart it is hard to tell whether that is definitely the problem. Could you send me the messages file leading up to the restart as well, please?

Thanks.

Sabrin (Author) Commented:
Is this due to the hardware (and/or BIOS) combination, or is it a bug in the kernel?

Sabrin (Author) Commented:
Is there any way to log everything?

talkster5 Commented:
Pretty much everything is already logged, either in messages or in the application's own log file.
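
What the standard logs do not record is the machine's state in the minutes before a freeze. One possible way to capture that, sketched here in Python 3 under stated assumptions, is to append a small snapshot (load average, free memory, free swap) to a file at regular intervals, for example from cron. The function names and the output path /var/log/health-snapshot.log are hypothetical; /proc/loadavg and /proc/meminfo are the standard kernel files.

```python
import time


def snapshot_line(loadavg_text, meminfo_text, now=None):
    """Format one line: timestamp, 1-minute load, free RAM and free swap."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    load1 = loadavg_text.split()[0]  # first field of /proc/loadavg
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = parts[0]  # value, in kB for these fields
    return "%s load1=%s MemFree=%skB SwapFree=%skB" % (
        stamp, load1, fields.get("MemFree", "?"), fields.get("SwapFree", "?"))


def append_snapshot(out_path="/var/log/health-snapshot.log"):
    """Append one snapshot to out_path (a hypothetical file name)."""
    with open("/proc/loadavg") as f:
        load = f.read()
    with open("/proc/meminfo") as f:
        mem = f.read()
    with open(out_path, "a") as out:
        out.write(snapshot_line(load, mem) + "\n")
```

After the next freeze, the last few lines of that file would show whether load or memory use was climbing beforehand.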

If you are renting this server from someone, then it should not really be you who has to fix the problem, as by the looks of things it has nothing to do with a configuration change you have made.

Sabrin (Author) Commented:
OK, I have disabled ACPI now. Let's see if it freezes again in the next 24 hours!
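
The thread does not say how ACPI was disabled. Assuming it was done with the `acpi=off` kernel boot parameter (an assumption, not something stated above), a quick way to confirm that the running kernel was actually booted with it is to look at /proc/cmdline. A small Python 3 sketch, with a made-up function name:

```python
def acpi_disabled_on_cmdline(cmdline):
    """Return True if the kernel command line contains the acpi=off parameter."""
    return "acpi=off" in cmdline.split()


# Possible usage on the server:
# with open("/proc/cmdline") as f:
#     print(acpi_disabled_on_cmdline(f.read()))
```

If ACPI was instead switched off only in the BIOS or by rebuilding the kernel, this check would report False even though the change was made.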
 
Sabrin (Author) Commented:
It's not the ACPI; the server keeps freezing.
Man, this sucks.

Question has a verified solution.