Load average higher than CPU percentage?

Greetings,

I don't know what is going on with one of my servers.

It is sitting with a load average of 10.09 10.08 10.02

However, if I look at top, I see the server is between 95% and 99% idle!

Is it possible that my server is running some hidden processes or something?

Please tell me how I can investigate the cause of this.
brittonv Asked:

rkursem Commented:
To my knowledge, the load average (as reported by uptime or rup) is the average number of processes in the run queue over the last 1, 5 and 15 minutes. On Linux, that count also includes processes in uninterruptible sleep (state D, usually waiting on I/O), not only those using the CPU.

Thus, it is possible to see a high load average together with high idle time if there is a lot of process switching going on, for instance with processes that hand over control of the CPU themselves. I think the function call is "yield()", AFAIR.
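
One way to check this on Linux: because the load average there counts processes in uninterruptible sleep (state D) as well as runnable ones, a load stuck near 10 on an idle CPU can mean roughly ten processes blocked on I/O, such as a hung mount. A minimal sketch, assuming procps ps; the blocked_procs helper name is made up for illustration:

```shell
# blocked_procs: hypothetical helper that keeps only the rows of
# `ps -eo state,pid,wchan:32,args` output whose state column is D
# (uninterruptible sleep). Such processes add to the load average
# without using any CPU.
blocked_procs() {
    awk '$1 == "D"'
}

# Run it against the live process table when ps is available.
# (assumption: `state`, `wchan` and `args` are procps format keywords)
if command -v ps >/dev/null 2>&1; then
    ps -eo state,pid,wchan:32,args 2>/dev/null | blocked_procs
fi
```

If this prints about as many rows as the load average, the wchan column hints at what those processes are waiting on.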
ajay_mhasal Commented:
Hi,

Your load average also depends on disk I/O wait: the CPU works much faster than the disk, so slow disk I/O can leave the CPU sitting idle. I think in your case disk I/O wait is causing the high load average. Recheck the "% wa" value in the output of the top command; if it is high, you are facing an I/O bottleneck and need to move to faster disk I/O such as SCSI or Fibre Channel.

You can use the iostat command to check further details about your disk I/O usage,

e.g.

# iostat -d -x 5 3
brittonv (Author) Commented:
I don't have iostat installed, but here is the output from top:
#####-03:~ # top c
top - 12:07:10 up 233 days, 12:45,  3 users,  load average: 11.01, 11.02, 11.00
Tasks: 394 total,   1 running, 393 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3% us,  0.7% sy,  0.0% ni, 98.3% id,  0.3% wa,  0.0% hi,  0.3% si
Mem:   2055984k total,  1907196k used,   148788k free,    80756k buffers
Swap:  1028120k total,  1024776k used,     3344k free,   869980k cached

wa seems OK, right?

ajay_mhasal Commented:
Hi,

Memory is the bottleneck in your case. You need to upgrade the RAM, as your swap usage is too high.
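
For reference, the swap figures in the top output above work out to nearly 100% in use. A small sketch of the arithmetic (the pct_used helper is hypothetical; the numbers are copied from the Mem and Swap lines of that output):

```shell
# pct_used TOTAL USED: print USED as a percentage of TOTAL (one decimal).
pct_used() {
    awk -v total="$1" -v used="$2" 'BEGIN {
        if (total <= 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * used / total
    }'
}

pct_used 1028120 1024776   # Swap line: prints 99.7
pct_used 2055984 1907196   # Mem line:  prints 92.8
```

Note that the Mem figure includes buffers and cache, so it overstates what applications are actually using.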

jools (Senior Systems Administrator) Commented:
I'm not convinced that, from a single listing of the head of the top output, I could definitely say memory is the bottleneck because of swap usage, but if memory on the server is low then it should perhaps be addressed.

You should really look at the server over a period of time using the sar, top, iostat and vmstat tools (procps and sysstat packages). Using ps also shows quite a bit of info.
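
To watch this over time and not from a single snapshot, the vmstat columns to follow are r (runnable), b (blocked in uninterruptible sleep), and si and so (swap in and out). A rough sketch, assuming the usual two header lines printed by procps vmstat; vmstat_summary is a made-up helper name:

```shell
# vmstat_summary: hypothetical filter that reduces vmstat output to the
# run queue (r), blocked (b) and swap-in/out (si, so) columns.
# Columns are located by name from the header row, not by position.
vmstat_summary() {
    awk '
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        ("r" in col) && $1 ~ /^[0-9]+$/ {
            print "r=" $(col["r"]), "b=" $(col["b"]), "si=" $(col["si"]), "so=" $(col["so"])
        }
    '
}

# Sample once per second, 3 times, if vmstat is installed.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 | vmstat_summary
fi
```

A b value that stays near 10 while si and so remain at 0 would point at blocked I/O, not active swapping.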
furetto Commented:
I agree with jools concerning the memory issue.
Memory could be the issue, but I would not say I'm positive about it from the top output.
vmstat and iostat should help, together with # mpstat -P ALL 2 10 (shows the usage of all processors every 2 seconds, 10 times).

Also, check whether the first processes shown in top are eating memory or CPU by running # top and keeping an eye on it (hit "q" to quit).
jdarwin Commented:
In your outputs I saw load averages of 10 and 11. How many CPUs do you have in that box?

As a rule of thumb, the load average should not exceed the number of CPUs in the server.

For a 1-CPU box, a load average >= 0.4 is overloaded; please correct me if anybody knows better.

So a load average of 1 on a single-CPU system means the system is being fully utilized. If this continues all the time, it is time to increase your system's CPU count and RAM.
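
To put the load figure in proportion to the CPU count, it can be divided by the number of processors. A small sketch, assuming Linux (/proc/loadavg) and coreutils nproc; load_per_cpu is a hypothetical helper:

```shell
# load_per_cpu LOAD NCPU: print the load average divided by the CPU count.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        if (ncpu <= 0) { print "n/a"; exit }
        printf "%.2f\n", load / ncpu
    }'
}

# Live value: 1-minute load from /proc/loadavg over the online CPU count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _ < /proc/loadavg
    load_per_cpu "$load1" "$(nproc)"
fi
```

On a hypothetical 4-CPU box, `load_per_cpu 10.09 4` prints 2.52. The figure still includes processes blocked on I/O, so a value above 1 does not by itself mean the CPUs are busy.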