Solved

How do I find the cause of high load on CentOS 5.4?

Posted on 2010-04-06
Last Modified: 2013-12-16
The server is running CentOS 5.4 (Final) with 8 GB RAM and 2 x Core Duo 2.8 GHz CPUs, with software RAID configured.

Every hour the load spikes from 0.16 up to 4.04 and locks up everything for around 5-65 minutes. There is no noticeable change in CPU or memory usage, just the increase in load. "top" does not reveal which application is causing the problem.

Any ideas on how to identify the culprit?
Question by:chrisk61
10 Comments
 
LVL 7

Expert Comment

by:ClintSwiney
ID: 29930488
Look at the cron jobs that are set to execute hourly. Disable them one by one and see if the problem goes away.
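
Keep in mind that "crontab -e" only shows one user's table. Below is a minimal sketch for checking the other places cron reads from. The paths are the usual CentOS/RHEL locations and are assumptions about this particular server, and the function names hourly_entries and scan_cron_sources are made up for illustration.

```shell
# Print crontab entries that fire at least once an hour:
# either the hour field is "*" or the entry uses the @hourly shortcut.
# Reads crontab text on stdin and skips comment lines.
hourly_entries() {
    awk '!/^[ \t]*#/ && ($2 == "*" || $1 == "@hourly") { print }'
}

# Walk the system-wide and per-user cron locations (assumed paths).
# Reading /var/spool/cron normally requires root.
scan_cron_sources() {
    for f in /etc/crontab /etc/cron.d/* /var/spool/cron/*; do
        [ -f "$f" ] || continue
        echo "== $f"
        hourly_entries < "$f"
    done
    # Scripts dropped here are run once an hour by run-parts.
    ls -l /etc/cron.hourly 2>/dev/null
}
```

Run scan_cron_sources as root and compare the minute fields it prints with the time of the spike.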
 

Author Comment

by:chrisk61
ID: 29931556
Hi, thanks. This is everything in crontab -e; there's nothing hourly:

*/5     *       *       *       *       /usr/bin/srvmonitor >/dev/null 2>&1
57      3       *       *       *       /usr/local/xxxxxxx/bin/notifications.sh >/dev/null 2>&1
14      4       *       *       *       /usr/local/xxxxxxx/bin/suspend.sh >/dev/null 2>&1
57      4       *       *       *       /usr/local/xxxxxxx/admin/sbin/run_updater >/dev/null 2>&1
15      5       *       *       *       /usr/local/xxxxxxx/bin/dbdump >/dev/null 2>&1
12      6       *       *       *       /usr/local/xxxxxxx/admin/sbin/rmvoicefax >/dev/null 2>&1
*/2     *       *       *       *       /usr/local/xxxxxxx/livemonitor/bin/livemonitor >/dev/null 2>&1
14      5       *       *       *       /usr/local/xxxxxxx/bin/cleantmp.sh >/dev/null 2>&1
01      6       *       *       *       /usr/local/xxxxxxx/admin/sbin/mngquota --action set --all --quiet >/dev/null 2>&1
*/5     *       *       *       *       /usr/local/xxxxxxx/bin/faxpreapproved.sh >/dev/null 2>&1
 
LVL 4

Expert Comment

by:pawwa
ID: 29931710
Also look for those processes that could produce a lot of disk activity (maybe some log parsing, webalizer or whatever).

With "top" you can look at "%wa", which shows the percentage of time the CPU sits idle while waiting for outstanding I/O requests. Heavy I/O wait drives the load up even when CPU usage looks low.
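
On Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones, which is how load can climb while CPU usage stays flat. A rough sketch for catching those tasks during a spike follows; filter_blocked and list_blocked are illustrative names, not existing tools.

```shell
# Keep only lines whose first column (process state) starts with "D".
# Expects `ps -eo stat,pid,user,comm` style input on stdin.
filter_blocked() {
    awk '$1 ~ /^D/ { print }'
}

# List processes currently blocked in uninterruptible sleep.
list_blocked() {
    ps -eo stat,pid,user,comm | filter_blocked
}
```

During a spike, something like "while true; do date; list_blocked; sleep 5; done" shows which processes keep turning up. Since this server uses software RAID, "cat /proc/mdstat" is also worth a look to see whether a resync or check is in progress at that time.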

 

Author Comment

by:chrisk61
ID: 29932799
Just checked %wa: it ranges from 25% to 45% when the load spikes, and it is back down to 0.0% now.
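
Since %wa is only elevated during the spike, it can help to record what the system is doing at that moment rather than watching by hand. This is a sketch, assuming vmstat and top are installed (both ship in the procps package on CentOS). The function names, default threshold, interval and log path are arbitrary choices.

```shell
# Succeed if the 1-minute load average exceeds a threshold.
# $1 = a line in /proc/loadavg format, $2 = threshold
load_exceeds() {
    echo "$1" | awk -v thr="$2" '{ exit !($1 + 0 > thr + 0) }'
}

# Poll the load and append a snapshot to a log whenever it is too high.
# $1 = threshold, $2 = seconds between polls, $3 = log file (all optional)
watch_load() {
    thr="${1:-2}"
    interval="${2:-10}"
    log="${3:-/var/tmp/load_snapshots.log}"
    while true; do
        if load_exceeds "$(cat /proc/loadavg)" "$thr"; then
            {
                echo "=== $(date) load: $(cat /proc/loadavg)"
                vmstat 1 3                 # b, bi/bo and wa columns show I/O pressure
                top -b -n 1 | head -n 30   # one batch-mode top snapshot
            } >> "$log" 2>&1
        fi
        sleep "$interval"
    done
}
```

Run "watch_load 2 10" in a spare terminal before the next spike and read the log afterwards.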
 
LVL 3

Expert Comment

by:deepak_iq
ID: 29944351

Download the attached nmon.zip to your Linux system.

Unzip it.

It contains two files.

Copy those files to /usr/local/bin and make them executable:
# cd /usr/local/bin
# chmod 700 nmon start_nmon

Add an entry in crontab:
1 0 * * *       /usr/local/bin/start_nmon 300 300

Then start it once by hand:
# /usr/local/bin/start_nmon 300 300

If the server reboots, the above command needs to be run again manually.

Its output will be in the directory /opt/nmon11b/output

To view its output, download the nmon file(s) to your Windows desktop.
Run nmon analyser on them and you will get .xls output covering the whole range of measurements.

Kindly let me know if it doesn't work.
nmon.zip
nmon-analyser.zip
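
A sampling interval of 300 seconds may be too coarse to pin down a spike that lasts only a few minutes. Here is a hedged sketch of running nmon directly with a shorter interval. It assumes the two arguments to start_nmon are interval and count (the script's contents are not shown in this thread), and it uses nmon's -f (write to file), -t (include top processes), -s (seconds between snapshots), -c (number of snapshots) and -m (output directory) options. coverage_minutes and start_fine_capture are illustrative names.

```shell
# Minutes covered by COUNT snapshots taken INTERVAL seconds apart.
# $1 = interval in seconds, $2 = snapshot count
coverage_minutes() {
    echo $(( $1 * $2 / 60 ))
}

# Start a finer-grained capture: one snapshot every 30 s for 2 hours.
# Binary path and output directory are the ones mentioned above.
start_fine_capture() {
    /usr/local/bin/nmon -f -t -s 30 -c 240 -m /opt/nmon11b/output
}
```

"coverage_minutes 300 300" prints 1500 (25 hours), which would fit the once-a-day cron entry if that reading of the arguments is right. "coverage_minutes 30 240" prints 120.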
 

Author Comment

by:chrisk61
ID: 29947814
OK followed above and got an error:

nohup: cannot run command `/usr/local/bin/chk_tcp': No such file or directory

I can see output in "mydomain.nmon"
but not in
"mydomain.-chk_tcp-100406_2131.log" and not in "mydomain_100406_2131.nmon"

Can you confirm what I should be looking for please?
 
LVL 3

Expert Comment

by:deepak_iq
ID: 29949109
Add the following file in /usr/local/bin:

# vi /usr/local/bin/chk_tcp

#!/bin/bash
# chk_tcp: print TCP connection-state counts and the httpd process count
# every N seconds, until killed.

#set -x

# Check command line arguments.
if [ "$#" -eq 0 ] ; then
   echo "Error: Missing parameter."
   printf 'Usage:\n\t%s <seconds>\n' "$(basename "$0")"
   printf 'Example:\n\t%s 5\n' "$(basename "$0")"
   exit 1
fi

sleep=$1

while true
do
   # [h]ttpd keeps grep from counting its own process.
   http=$(ps -ef | grep -c '[h]ttpd')

   # Take one netstat snapshot so all counts come from the same moment.
   netstat -an >/tmp/netstat.txt
   EST=$(grep -c ESTABLISHED /tmp/netstat.txt)
   TW=$(grep -c TIME_WAIT /tmp/netstat.txt)
   CW=$(grep -c CLOSE_WAIT /tmp/netstat.txt)
   FIN=$(grep -c FIN_WAIT /tmp/netstat.txt)

   echo "$(date) $(hostname) EST:$EST TW:$TW CW:$CW FIN:$FIN http:$http"
   sleep "$sleep"
done

Make it executable:

# chmod +x /usr/local/bin/chk_tcp

And then execute:

# /usr/local/bin/start_nmon 300 300

To check whether nmon is running on the server:

# ps -eaf | grep '[n]mon'
 
LVL 3

Expert Comment

by:deepak_iq
ID: 29949335
Before running nmon manually, make sure you kill any nmon process that is already running.
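
One way to do that, as a sketch: it assumes the binary's process name is exactly "nmon", and nmon_pids and stop_nmon are illustrative helper names.

```shell
# Print the PIDs of processes whose command name is exactly "nmon".
# Reads "PID COMMAND" lines (as printed by `ps -eo pid,comm`) on stdin.
nmon_pids() {
    awk '$2 == "nmon" { print $1 }'
}

# Send SIGTERM to every running nmon, if there is one.
stop_nmon() {
    pids=$(ps -eo pid,comm | nmon_pids)
    # $pids is left unquoted on purpose so each PID becomes its own argument.
    [ -n "$pids" ] && kill $pids
    return 0
}
```

Call stop_nmon, then start the capture again. chk_tcp loops until it is killed, so a leftover copy of it may need the same treatment.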
 

Author Comment

by:chrisk61
ID: 29951717
Thanks, this works fine, but can you confirm what I am supposed to be looking at, please?
As a reminder, I am trying to find the cause of the hourly load peaks on the server.
 
LVL 3

Accepted Solution

by:
deepak_iq earned 500 total points
ID: 29987589

OK, first hurdle cleared.

Now transfer the nmon file to your local desktop via FTP.

Open the nmon_analyser .xls workbook and choose the "Analyse nmon data" option.

It will then ask for the nmon file; select its location. The analyser automatically generates an .xls output file, which you can save under any name in a location of your choice on your PC.

As for what information to gather from it:

Each sheet of that workbook holds a lot of detail. The ones that matter in this case show which processes consumed the most memory, what the CPU utilization was, and what the disk I/O was on each disk during the capture period.
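
The raw .nmon file can also be checked on the server before it goes into the spreadsheet. The sketch below assumes the usual nmon CSV layout, where ZZZZ lines carry the timestamp of each snapshot tag (T0001, T0002, ...) and CPU_ALL lines carry User%, Sys%, Wait% and Idle% in that order. Confirm the column order against the CPU_ALL header line in your own file. high_iowait is an illustrative name.

```shell
# Print the timestamps of snapshots whose CPU Wait% exceeds a threshold.
# $1 = nmon file, $2 = Wait% threshold (defaults to 20)
high_iowait() {
    awk -F, -v thr="${2:-20}" '
        # ZZZZ,T0001,21:31:05,06-APR-2010 maps a snapshot tag to its time.
        $1 == "ZZZZ" { ts[$2] = $3 " " $4 }
        # CPU_ALL data lines; the header line has no Tnnnn tag and is skipped.
        $1 == "CPU_ALL" && $2 ~ /^T[0-9]+$/ { n++; tag[n] = $2; wa[n] = $5 }
        END {
            for (i = 1; i <= n; i++)
                if (wa[i] + 0 > thr + 0)
                    print ts[tag[i]], "Wait%=" wa[i]
        }
    ' "$1"
}
```

For example, "high_iowait /opt/nmon11b/output/mydomain_100406_2131.nmon 20" lists the snapshots taken during the spike. The disk and top-process sheets for those same timestamps are then the place to look for the culprit.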

For more background, you can visit IBM's website for nmon-analyser. I have also attached a document that explains the various terms in the output you have received.

If you still face any difficulty, send me the nmon file and I will tell you, in general terms, what information it has captured.

I hope this resolves your query.
