Solved

dedicated server goes down for no reason

Posted on 2006-11-17
Last Modified: 2010-04-20
Hello,
I have a dedicated server at my hosting company. It runs CentOS 4.3 with all
the updates installed, but the server sometimes goes down for no apparent reason, so I have to
send a reboot request! Can someone please help or tell me why this is happening?
All the cron jobs have already been checked!
Question by:Sabrin
23 Comments
 
LVL 12

Expert Comment

by:ibu1
ID: 17970499
What does your /var/log/message contain?
 

Author Comment

by:Sabrin
ID: 17970506
It's empty.
 

Author Comment

by:Sabrin
ID: 17970513
My /var/log/messages shows a lot!
What am I looking for?
 
LVL 12

Expert Comment

by:ibu1
ID: 17970642
When you say it sometimes goes down for no reason, what do you mean? Does it hang, or something else?
Can you post the contents here?
Look at what errors it shows before you restart the server.
 
LVL 43

Expert Comment

by:ravenpl
ID: 17970666
As usual in such cases: have you checked your RAM already? http://www.memtest.org/
 

Author Comment

by:Sabrin
ID: 17970679
I only have root access over SSH. How would I test the RAM?
 
LVL 43

Expert Comment

by:ravenpl
ID: 17970713
Ask the admin to reboot and test with memtest86, if it's free...

You may try scripts like http://people.redhat.com/dledford/memtest.html
Another test, simple and efficient, is to spawn a kernel compilation with an unlimited number of jobs. http://linuxmafia.com/faq/VALinux-kb/ram-testing.html
But note: if the test fails, you know the RAM is broken; if the test passes, you in fact know nothing.
On the other hand, if memtest86 reports no errors for, say, 1 hour, you can be pretty sure it's fine.
 

Author Comment

by:Sabrin
ID: 17975571
It's not the memory!
 
LVL 3

Expert Comment

by:talkster5
ID: 17990665
Hi,
Can you open /var/log/messages with nano, press Ctrl+W, type in restart and press Enter? This should take you to a line saying the system has restarted, with a load of kernel lines below it. If you scroll up a bit and check the lines before the restart, you may be able to find some clues as to why the network has cut out.

Thanks.
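
The same search can also be done non-interactively with grep, which prints the lines leading up to each match. This is only a minimal sketch, assuming a standard grep; the function name show_before_restart, the default of 40 context lines and the optional pattern argument are made up for illustration.

```shell
#!/bin/sh
# Print every line matching a pattern in a log file, together with the
# lines that come before it.
# Usage: show_before_restart [logfile] [context_lines] [pattern]
show_before_restart() {
    logfile="${1:-/var/log/messages}"
    context="${2:-40}"
    pattern="${3:-restart}"
    # -n prefixes each line with its line number,
    # -B prints that many lines of context before each match.
    grep -n -B "$context" -e "$pattern" "$logfile"
}
```

For example, `show_before_restart /var/log/messages 60`. A syslogd "restart" line on its own may come from nightly log rotation rather than a reboot, so passing a different pattern such as 'Linux version' as the third argument might help to find real boots; that is an assumption about what the log contains, not something confirmed in this thread.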
 

Author Comment

by:Sabrin
ID: 17990771
This is what I see:

Nov 21 04:03:33 dedicated syslogd 1.4.1: restart.

Just lines like that, nothing else about the system; only syslogd has "restart".
 
LVL 3

Expert Comment

by:talkster5
ID: 17990809
Hi,
Could you put up a link for me to download your messages file from your server, so I can take a proper look at it for you?

Thanks.

 

Author Comment

by:Sabrin
ID: 17991090
Yes, you can download them from here:
members.lycos.co.uk/eehost/messages/
 

Author Comment

by:Sabrin
ID: 17991099
There was a reboot request on Nov 20.
 

Author Comment

by:Sabrin
ID: 17991103
I gave you the logs from Nov 19 to Nov 21.
 
LVL 3

Expert Comment

by:talkster5
ID: 17991167
Hi,
It looks like there may be an issue with ACPI, which I have known to cause problems on a server I ran before. It may be worth recompiling the kernel without ACPI support and getting it disabled in the BIOS.

Failing that, the only other time I have seen something like what you describe was when all the memory was in use and the swap was too small, causing the server to freeze until it was rebooted. This may not be the case here, but it is worth checking.

Thanks.
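
The memory and swap theory can be checked from an SSH session. This is a rough sketch, assuming a Linux /proc/meminfo and that the kernel writes its out-of-memory messages to /var/log/messages; the function names are hypothetical and the exact wording of those kernel messages varies between kernel versions.

```shell
#!/bin/sh
# Percentage of swap currently in use, read from a meminfo-style file.
swap_used_percent() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { avail = $2 }
         END {
             if (total > 0)
                 printf "%d\n", (total - avail) * 100 / total
             else
                 print "no swap"
         }' "$meminfo"
}

# Number of log lines that look like kernel out-of-memory messages.
# The two patterns are an assumption about how the kernel words them.
count_oom_events() {
    logfile="${1:-/var/log/messages}"
    grep -c -i -E 'out of memory|oom-killer' "$logfile"
}
```

Running `swap_used_percent` and `count_oom_events` with no arguments reads the live system files. A high swap percentage or a non-zero count shortly before a freeze would support the memory explanation; a zero count does not rule it out, since a frozen machine may not manage to write the message.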
 

Author Comment

by:Sabrin
ID: 18001540
Hello talkster5,
Today, Nov 23, at 3 am the server stopped responding, so I sent a reboot
request for them to reboot the server manually. When the server came
back up I copied the messages file and uploaded it to the site, so could
you please check it one more time to make sure it's the ACPI?
Here: http://members.lycos.co.uk/eehost/messages/
Thanks.
 
LVL 3

Expert Comment

by:talkster5
ID: 18001831
Hi,
It looks like there is something wrong with ACPI, but without seeing what comes before the restart it is hard to tell if that is definitely the problem. Could you send me the messages file leading up to the restart as well, please?

Thanks.
 

Author Comment

by:Sabrin
ID: 18005055
Is this due to the hardware (and/or BIOS) combination, or is it a bug in the kernel?
 
LVL 3

Accepted Solution

by:
talkster5 earned 500 total points
ID: 18005074
ACPI handles power management, which the operating system controls, but as far as I know it also has to be enabled in the BIOS to allow the operating system to interact with the hardware. However, this may not actually be the cause of the problem; I cannot give you a definite answer until I see the log from before the restart, as this will most likely show what changed or went wrong. The log file from after the reboot only shows me that everything has come back online and is running.
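
One way to see whether ACPI was switched off at boot, without recompiling the kernel, is to look for an acpi= parameter on the kernel command line. This is a minimal sketch, assuming a Linux kernel that exposes /proc/cmdline; the function name is hypothetical, and acpi=off is the standard kernel boot parameter rather than something stated in this thread.

```shell
#!/bin/sh
# Report the acpi= setting passed to the kernel at boot, if any.
acpi_boot_setting() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each boot parameter on its own line and keep the last acpi= one.
    setting=$(tr ' ' '\n' < "$cmdline_file" | grep -E '^acpi=' | tail -n 1)
    echo "${setting:-not set}"
}
```

Calling `acpi_boot_setting` with no argument reads the running kernel's command line. A result of "not set" only means no acpi= parameter was passed at boot; ACPI could still have been disabled in the BIOS or left out of the kernel build.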
 

Author Comment

by:Sabrin
ID: 18005530
Is there any way to log everything?
 
LVL 3

Expert Comment

by:talkster5
ID: 18006810
Pretty much everything is already logged, either in messages or in the application's own log file.

If you are renting this server from someone, then it should not really be you who has to fix the problem, as by the looks of things it has nothing to do with a configuration change you have made.
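
If the standard logs show nothing before a freeze, one option is to record a small snapshot of the system state at regular intervals, so that the last entry before the hang shows load and memory at that point. This is only a sketch, assuming a Linux /proc filesystem; the function name and the default log path /var/log/freeze-watch.log are made up for illustration.

```shell
#!/bin/sh
# Append a timestamped snapshot of load and memory to a log file.
log_snapshot() {
    out="${1:-/var/log/freeze-watch.log}"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Load averages and number of running processes.
        cat /proc/loadavg 2>/dev/null
        # Free memory and swap figures.
        grep -E '^(MemFree|SwapTotal|SwapFree):' /proc/meminfo 2>/dev/null
        # The five processes using the most memory (header line sorts last).
        ps -eo pid,pmem,pcpu,comm 2>/dev/null | sort -k2,2nr | head -n 5
    } >> "$out"
    # Flush to disk so the entry survives a hard reset.
    sync
}
```

Saved as a script that ends with a call to `log_snapshot`, this could be run from cron every minute, for example with an entry like `* * * * * /usr/local/sbin/freeze-watch.sh` (a hypothetical path). After the next freeze, the last few snapshots in the file are the ones to look at.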
 

Author Comment

by:Sabrin
ID: 18006823
OK, I have disabled ACPI. Now let's see if it freezes again in the next 24 hours!
 

Author Comment

by:Sabrin
ID: 18031653
It's not the ACPI; the server keeps freezing..
Man, this sucks.