• Status: Solved

SAR vs. TOP for monitoring CPU

Hi,

We have been monitoring the CPU usage on a system with sar and top. top seems to give rather spurious results, with the CPU idle figure apparently unrelated to the amount of CPU taken up by processes. See the information below: one process is taking up 97% of a CPU, yet the minimum CPU idle is 12%?

Is there anything to be aware of when reviewing CPU usage using top or sar? Which is the more accurate?

Many thanks,

Neil

System: hwiodsp1                                      Wed Jun 14 10:07:56 2006
Load averages: 0.41, 0.49, 0.40
227 processes: 196 sleeping, 30 running, 1 zombie
Cpu states:
CPU   LOAD   USER   NICE    SYS   IDLE  BLOCK  SWAIT   INTR   SSYS
 0    0.32   2.6%   0.0%   0.0%  97.4%   0.0%   0.0%   0.0%   0.0%
 1    0.53  50.2%   0.0%   1.0%  48.8%   0.0%   0.0%   0.0%   0.0%
 2    0.52  87.5%   0.0%   0.8%  11.8%   0.0%   0.0%   0.0%   0.0%
 3    0.26  58.3%   0.0%   0.4%  41.3%   0.0%   0.0%   0.0%   0.0%
---   ----  -----  -----  -----  -----  -----  -----  -----  -----
avg   0.41  49.5%   0.0%   0.6%  49.9%   0.0%   0.0%   0.0%   0.0%
 
Memory: 2937204K (1391440K) real, 3136700K (1535900K) virtual, 2023896K free  Page# 1/7
 
CPU TTY     PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND
 2   ?    17221 oracle   241 20  2241M 19132K run     49:06 96.84 96.67 oracleGIOSPROD
 3   ?    28667 oracle   241 20  2241M 19516K run      2:42 43.52 43.45 oracleGIOSPROD
 0   ?     9591 oracle   154 20  2233M 11372K sleep    0:00  4.82  2.30 oracleGIOSPROD
 0   ?     9467 oracle   154 20  2230M  8932K sleep    0:00  1.39  1.39 oracleGIOSPROD
 
JJSmith commented:

It's not really a competition. If one were simply better than the other, we would only have the one!

They WILL show different outputs for the same 'statistic', basically because they measure using different parameters and algorithms. While doing their respective calculations, two instances (sar + top) will not be sampling 'something' at the same moment, due to natural contention or simply not being in sync with each other. Throw in the fact that the work of one influences the stats of the other and we are almost into chaos theory ;-)
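To illustrate the 'not in sync' point, here is a minimal Python sketch. The busy/idle trace and the 5-tick window are made up for the example; this is not how sar or top actually collect their data, only a picture of two observers looking at the same CPU at slightly different moments.

```python
# Hypothetical per-tick trace for one CPU: 1 = busy, 0 = idle.
# The pattern is invented: 7 busy ticks followed by 3 idle ticks, repeated.
TRACE = ([1] * 7 + [0] * 3) * 20  # 200 ticks in total


def busy_percent(trace, start, length):
    """Percentage of busy ticks in trace[start:start + length]."""
    window = trace[start:start + length]
    return 100.0 * sum(window) / len(window)


# Two observers with the same 5-tick window but different start times.
sampler_a = busy_percent(TRACE, start=0, length=5)  # ticks 0..4
sampler_b = busy_percent(TRACE, start=5, length=5)  # ticks 5..9
overall = busy_percent(TRACE, start=0, length=len(TRACE))

print(f"sampler A:   {sampler_a:.1f}% busy")
print(f"sampler B:   {sampler_b:.1f}% busy")
print(f"whole trace: {overall:.1f}% busy")
```

Both samplers are 'right' about the window they looked at, yet neither matches the other or the figure for the whole trace.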

We have to accept that they will deliver different figures side by side. If I were asked which is the more accurate, I would plump for sar delivering the best 'averages', provided a reasonably sized interval is used (the smaller the interval, the rougher the average). So it's a good, free tool for longer-term monitoring.
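The interval effect can be sketched in Python as well. Everything here is an assumption for illustration: the trace is random (each tick busy with probability 0.5), the seed is fixed only for reproducibility, and `interval_averages` and `spread` are hypothetical helper names, not anything sar provides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the made-up trace is reproducible

# Hypothetical trace: each tick is busy with probability 0.5.
TRACE = [1 if random.random() < 0.5 else 0 for _ in range(10_000)]


def interval_averages(trace, interval):
    """Busy percentage for each consecutive, non-overlapping interval."""
    return [
        100.0 * sum(trace[i:i + interval]) / interval
        for i in range(0, len(trace) - interval + 1, interval)
    ]


def spread(trace, interval):
    """Standard deviation of the per-interval busy percentages."""
    return statistics.pstdev(interval_averages(trace, interval))


for interval in (10, 100, 1000):
    print(f"interval {interval:>4} ticks: spread {spread(TRACE, interval):.1f} points")
```

The longer the interval, the less the individual readings scatter around the true 50%, which is why a sensibly sized interval gives steadier averages.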

Of course there are 'costed' tools that take all this into account and, for a price, will provide you with a third set of figures if run side by side (sometimes they will even discount their own impact on the measured resources ;-)

If we take a 1000th of a second as a clock tick (really slow these days), then there are 86,400,000 of them in a day. Is the state of the server on one of these ticks really that important?

If we calculate how busy a CPU is and subtract that from 100, then we have the idle figure. But what was it busy doing, sys or user? And hang on... did it spend any time waiting for I/O?
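As a quick check of that arithmetic, the sketch below takes the per-CPU rows from the top output in the question and derives idle from busy. Only USER, SYS and IDLE are included, since every other column in that output reads 0.0%; the function names are just for this example.

```python
# Per-CPU percentages copied from the top output in the question.
# NICE, BLOCK, SWAIT, INTR and SSYS are left out because they are all 0.0 there.
ROWS = {
    0: {"user": 2.6, "sys": 0.0, "idle": 97.4},
    1: {"user": 50.2, "sys": 1.0, "idle": 48.8},
    2: {"user": 87.5, "sys": 0.8, "idle": 11.8},
    3: {"user": 58.3, "sys": 0.4, "idle": 41.3},
}


def busy(row):
    """Busy percentage: user time plus system time."""
    return round(row["user"] + row["sys"], 1)


def idle_from_busy(row):
    """Idle percentage derived as 100 minus busy."""
    return round(100.0 - busy(row), 1)


for cpu, row in ROWS.items():
    print(
        f"CPU {cpu}: busy {busy(row):5.1f}%  "
        f"derived idle {idle_from_busy(row):5.1f}%  "
        f"reported idle {row['idle']:5.1f}%"
    )
```

The derived and reported idle figures agree to within the rounding of the display. A single 'busy' number still hides the user/sys split, which is why the per-state columns are worth reading.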

If one were simply better than the other, we would only have the one! Or have I already said that?

I don't seek to confuse, but sometimes there are differences... just because.

Cheers
JJ



 
gheist commented:
No system is perfect in such accounting.

The output of uname -a from your system would help with system-specific interpretation.
 
neil_mw (Author) commented:
No offence to gheist, however I'll accept the lengthy answer by JJSmith.  Sorry for the delay in replying.