Solved

troubleshooting fedora server lockup

Posted on 2011-03-07
Last Modified: 2012-05-11
I am starting to work with Fedora servers and I am not very well versed in administration of Linux. I had a server that locked up this morning. I could not ping the server and when I got to the monitor there was just a black screen.

I had to do a hard reboot to get the server up and running.

How can I start to troubleshoot this? I am not sure which logs to look at first.
Question by:ryan80
10 Comments
 
LVL 31

Expert Comment

by:farzanj
ID: 35060580
Try looking at

/var/log/messages

/var/log/secure

I am not sure how you got locked.

Did you have a prompt on the console?
 
LVL 6

Expert Comment

by:_iskywalker_
ID: 35060690
in /var/log/ there are plenty of logs, you should know what they are, as an admin, so study them!
 
LVL 12

Author Comment

by:ryan80
ID: 35061734
Thanks for the great response, iskywalker. I didn't realize that as an admin I should be familiar with a system and that I would have to review logs.

My inexperience with Linux is why I am asking. Of course I should know what the logs are and what they contain. Maybe something a little more constructive, like references to where I can find information on logs, or what the basic logs are, would be helpful. I know that I can browse through a thousand different articles or books and find more information on it, and I will, but I am trying to troubleshoot an issue now. I posted this question to try to narrow down my search and what I have to research.

@farzanj:
thanks,

There was no prompt at the console. The whole system was unresponsive. I am reviewing those logs now for some hint on what caused the issue.
 
LVL 31

Expert Comment

by:farzanj
ID: 35062072
A little more info would make it easy for me as I want to help!
 
LVL 12

Author Comment

by:ryan80
ID: 35062309
Thanks,

I am looking through those logs now.

What I am seeing so far is that nothing was logged from the time the system stopped responding until the reboot. Here is what I have so far.

From /var/log/messages:

Mar  7 04:18:51 server-in-question snmpd[2118]: Connection from UDP: [xx.x.x.46]:2225
Mar  7 04:18:51 server-in-question snmpd[2118]: Connection from UDP: [xx.x.x.46]:2225
Mar  7 04:18:51 server-in-question snmpd[2118]: Connection from UDP: [xx.x.x.46]:2225
Mar  7 04:18:51 server-in-question snmpd[2118]: Connection from UDP: [xx.x.x.46]:2225
Mar  7 04:19:03 server-in-question mountd[2285]: authenticated mount request from xx.x.x.251:874  or /prod/home/prod (/Production)
Mar  7 04:20:21 server-in-question mountd[2285]: authenticated unmount request from xx.x.x.173:753 for /Dev/home/dev (/Development)
Mar  7 08:47:52 server-in-question kernel: imklog 3.20.2, log source = /proc/kmsg started.
Mar  7 08:47:52 server-in-question rsyslogd: [origin software="rsyslogd" swVersion="3.20.2" x-pid="1885" x-info="http://www.rsyslog.com"] restart

The address at xx.x.x.46 is my monitoring server polling SNMP. The last time it showed the server responding was at 4:18. Around 8:47 I power-cycled the server.

Here are the logs at that time from secure:

Mar  7 04:19:02 server-in-question sshd[6602]: Connection closed by 127.0.0.1
Mar  7 04:20:02 server-in-question sshd[6613]: Connection closed by 127.0.0.1
Mar  7 08:48:24 server-in-question sshd[2356]: Server listening on 0.0.0.0 port 22.
Mar  7 08:48:24 server-in-question sshd[2356]: Server listening on :: port 22.
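
For anyone comparing excerpts like the two above, the silent window can be found mechanically rather than by eye. This is only a rough sketch: `largest_gap` is a made-up helper name, it assumes GNU `date` is available, and it assumes the classic syslog timestamp format shown in these logs (no year, so all lines are treated as being in the current year).

```shell
# Print the longest silence between consecutive syslog lines read from stdin.
# Assumption: GNU date, and timestamps in the first 15 characters of each line.
largest_gap() {
    prev_epoch="" prev_stamp="" best=0 best_from="" best_to=""
    while IFS= read -r line; do
        # The classic syslog timestamp, e.g. "Mar  7 04:20:21"
        stamp=$(printf '%s\n' "$line" | cut -c1-15)
        # Parse as UTC to avoid daylight-saving surprises; skip unparsable lines
        epoch=$(date -u -d "$stamp" +%s 2>/dev/null) || continue
        if [ -n "$prev_epoch" ] && [ $((epoch - prev_epoch)) -gt "$best" ]; then
            best=$((epoch - prev_epoch))
            best_from=$prev_stamp
            best_to=$stamp
        fi
        prev_epoch=$epoch
        prev_stamp=$stamp
    done
    echo "$best seconds between '$best_from' and '$best_to'"
}

# On the server (it calls date once per line, so limit the input on big logs):
#   tail -n 2000 /var/log/messages | largest_gap
```

Run against the messages excerpt above, the longest gap is the one between the 04:20:21 unmount request and the 08:47:52 rsyslogd restart, which narrows the hang to shortly after 04:20.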

 
LVL 31

Expert Comment

by:farzanj
ID: 35064293
There are a few things that should be considered.

1.  Fedora is NOT a production brand.  For production use either Red Hat (if you can pay) or CentOS (free).  Fedora is a testing distribution that feeds into the stable Red Hat releases.

2.  If you have your server running at run level 5 (GUI), you are asking for trouble.  Linux GUIs are not stable.  Servers should run at run level 3.

3.  Run the fewest services you can.  Services you don't need should not be running on your system.

4.  If possible, enable rsyslog for remote logging.

5.  Also try the last command:
http://linux.about.com/library/cmd/blcmdl1_last.htm
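
Point 2 can be changed on a system that is already installed. A minimal sketch, assuming the default run level is set by the `id:N:initdefault:` line in /etc/inittab (which is where Fedora releases of this era keep it); `set_default_runlevel` is a hypothetical helper name, not a standard command:

```shell
# Rewrite the default run level in an inittab-style file.
#   $1 = path to the file, $2 = new run level (e.g. 3)
# A copy of the original is kept next to it with a .bak suffix;
# only the "id:N:initdefault:" line is changed.
set_default_runlevel() {
    file=$1
    level=$2
    cp "$file" "$file.bak" &&
    sed "s/^id:[0-9]:initdefault:/id:${level}:initdefault:/" "$file.bak" > "$file"
}

# On the server, as root (takes effect at the next boot):
#   set_default_runlevel /etc/inittab 3
# To leave the GUI right away without rebooting:
#   telinit 3
```

It is worth checking the file by hand afterwards, since a broken initdefault line can stop the machine from booting normally.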
 
LVL 12

Author Comment

by:ryan80
ID: 35070823
Thanks for the input.

1. Not that I would know any better, but Fedora was what was being used when I arrived. It is version 9. Not sure why it was picked. Once I know my ass from my elbow with Linux, maybe I can recommend that we move to a different distro.

2. The console does have a graphical screen when I get there. I have not tried to log in though. In my limited experience I have always worked through the CLI with Linux, so I have no problem getting rid of this. Is there a way to do this after the fact, or is it done at installation?

3. Where can I find the config file that lists the services that start?

 
LVL 31

Accepted Solution

by:
farzanj earned 2000 total points
ID: 35070894
chkconfig --list


You can also do this:

ls /etc/rc.d/rc3.d/S*
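
The full chkconfig listing is long, so it can help to filter it down to what actually starts at one run level. A small sketch of that, where `services_on` is a made-up helper name and "cups" in the comments is only a placeholder for whatever service you decide you do not need:

```shell
# List services that chkconfig reports as "on" for a given run level.
# Reads `chkconfig --list` style text on stdin; the run level defaults to 3.
services_on() {
    awk -v want="${1:-3}:on" '{ for (i = 2; i <= NF; i++) if ($i == want) print $1 }'
}

# On the server:
#   chkconfig --list | services_on 3
# To stop an unneeded service now and keep it from starting at boot:
#   service cups stop
#   chkconfig cups off
```

The xinetd-based entries at the end of the chkconfig output use a different layout and are not matched by this filter.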

 
LVL 31

Expert Comment

by:farzanj
ID: 35070963
If the system hangs completely, there is very little anyone can do except reboot.

I don't know whether you have sar enabled on your system. If it is, it could perhaps tell you about the historical state of your system.
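
If sar is collecting data, a sketch of how the history for the morning of the hang might be pulled up. This assumes the sysstat package is installed and writing its daily files to /var/log/sa (the usual location on Fedora and Red Hat, but check your system); `sa_file` is a hypothetical helper that only builds the file name:

```shell
# Build the path of the sysstat daily data file for a day of the month.
# Assumption: data files live in /var/log/sa and are named saDD.
sa_file() {
    # Strip one leading zero so "08" is not read as an octal number
    printf '/var/log/sa/sa%02d\n' "${1#0}"
}

# On the server, for the hour around the hang on the 7th:
#   sar -u -f "$(sa_file 7)" -s 04:00:00 -e 05:00:00   # CPU usage
#   sar -r -f "$(sa_file 7)" -s 04:00:00 -e 05:00:00   # memory usage
#   sar -q -f "$(sa_file 7)" -s 04:00:00 -e 05:00:00   # run queue and load
```

Memory running out or load climbing steeply in the last samples before 04:20 would be a useful hint; if the files do not exist, sar was not collecting.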
 
LVL 12

Author Closing Comment

by:ryan80
ID: 35073478
thanks for all the help.