Solved

AIX CPU load average

Posted on 2010-08-27
Last Modified: 2012-05-10
I have some questions regarding the CPU load average reported by the uptime command. We have also collected performance metrics on our AIX server over several months, tabulating the CPU load average over 1 minute, 5 minutes and 15 minutes.

Our AIX server has 2 CPUs.

Questions:

1. What does the CPU load average really mean? I have read some articles. Some describe it as the average number of processes in the queue; others as the average number of processes that are using the CPU.

2. What contributes to the load average count? Is it just the processes in the queue waiting to be executed plus the processes that are being executed?

3. Our server's load average has a minimum of 0.25, a maximum of 3.7 and an average of 0.92. At what value should I start to worry?

4. Given that we have 2 CPUs, should I divide the load_average value by two to get the load_average per CPU?

Thank you.
Question by:tommym121
7 Comments
 
LVL 5

Expert Comment

by:shajithchandran
ID: 33544577
These are the averages of the number of processes in the run queue waiting for the cpu.
 
LVL 5

Expert Comment

by:shajithchandran
ID: 33544746
1. It's just the average number of processes waiting on the run queue.
2. Same as above.
3. Ideally, if the value is more than 1, that means that, on average, there is at least one thread waiting for the CPU. How much is worrying really depends on your expectations. The numbers that you gave look OK to me.
4. No, you should not divide by two. The values given are system-wide averages, so they already take the number of CPUs into account.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 33545464
1) My apologies for contradicting, but load is the number of threads waiting to be dispatched plus the number of threads running on a CPU.
2) See above. Load rises when many threads are spawned in short intervals.
3) If it's AIX, what version? Starting with AIX 5.3 running on Power5 processors there is SMT (Simultaneous Multi-Threading), meaning that a single CPU core can hold the state of two threads and share its components (ALU, FPU, ...) between them. So there is no reason to worry about a load equalling the number of logical CPUs, which is, given AIX 5.3 and Power5, twice the number of cores in the system (or Virtual Processing Units in an LPAR).
4) The run queue is (from a logical perspective) a system-wide queue. See my calculation under (3). Thus a "load average per CPU" doesn't make sense.
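
The comparison in (3) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample uptime line are hypothetical, and it assumes that "2 CPUs" means 2 cores with SMT giving two logical CPUs per core, which the question does not confirm.

```python
import re


def parse_load_averages(uptime_line):
    """Return the (1 min, 5 min, 15 min) load averages from a line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


def logical_cpus(cores, smt_threads_per_core=2):
    """Logical CPUs seen by the OS; two threads per core is the Power5 SMT case."""
    return cores * smt_threads_per_core


def load_exceeds_logical_cpus(load, cores, smt_threads_per_core=2):
    """True if the load is above the number of logical CPUs."""
    return load > logical_cpus(cores, smt_threads_per_core)


if __name__ == "__main__":
    # Hypothetical uptime line, made up for illustration only.
    sample = "  10:15AM   up 42 days,  3:07,  2 users,  load average: 0.92, 0.88, 0.75"
    one_min, five_min, fifteen_min = parse_load_averages(sample)
    print("1-minute load:", one_min)

    # The maximum from the question (3.7) against 2 cores with SMT (assumed).
    print("logical CPUs:", logical_cpus(2))
    print("3.7 exceeds logical CPUs:", load_exceeds_logical_cpus(3.7, 2))
```

Under that assumption, the reported maximum of 3.7 stays below the 4 logical CPUs; with SMT disabled (one thread per core) it would exceed the 2 available.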

wmp




 

Author Comment

by:tommym121
ID: 33545569
Hi, we are on AIX 6.1.

Back to my question #3: at what load_average should I start to worry? Is there any guidance on how to determine whether we have enough CPU based on the load_average indicator? We are looking for indicators of whether our current server capacity can handle additional workload. We will add more CPUs if necessary. We just want to know whether we can determine this from the load average.

Thank you.
 
LVL 68

Accepted Solution

by:woolmilkporc (earned 1200 total points)
ID: 33546127
To answer this, I need to know what you mean by "2 CPUs". Cores? Virtual Processing Units? Logical Processors?

As I wrote, if the average load doesn't exceed the number of Logical CPUs, there is absolutely nothing to worry about as far as load is concerned, though there may be in some other regard.

Load is only one criterion, and not the best. There are also Wait for I/O and CPU % busy, for example.

Assuming you have only a few, but heavy, threads running, your load will be low but Utilization will be high.
If this leads to performance problems, adding more virtual CPUs will have no effect in such a scenario. Here the processors need to be faster, which you can achieve by adding more processing units (if LPAR), which makes a single virtual CPU logically faster, or by purchasing a server with faster processors if it's a standalone (not virtualized) server.

On the other hand, if you have lots of small threads to process, your load could be high, while Utilization will be rather low. The same is true if your machine has to wait for I/O.

That's to say, Average Load is not a good indicator.
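
To make the two scenarios above concrete, here is a minimal Python sketch that reads load and Utilization together. The function name, the 80 % busy threshold and the handling of the "both high" case are assumptions for illustration, not AIX guidance.

```python
def interpret(load_average, logical_cpus, cpu_busy_pct, busy_threshold_pct=80.0):
    """Classify a (load, utilization) pair into the scenarios described above.

    busy_threshold_pct is an arbitrary cut-off chosen for this sketch.
    """
    high_load = load_average > logical_cpus
    high_util = cpu_busy_pct >= busy_threshold_pct

    if high_util and not high_load:
        # Few heavy threads: more virtual CPUs will not help, faster ones will.
        return "few heavy threads"
    if high_load and not high_util:
        # Many small threads, or the machine is waiting for I/O.
        return "many small threads or I/O wait"
    if high_load and high_util:
        # Not covered in the post; treated here as plain CPU saturation (assumption).
        return "saturated"
    return "no pressure"


if __name__ == "__main__":
    # Made-up readings: low load, high utilization on 4 logical CPUs.
    print(interpret(load_average=1.5, logical_cpus=4, cpu_busy_pct=95.0))
    # Made-up readings: high load, low utilization.
    print(interpret(load_average=9.0, logical_cpus=4, cpu_busy_pct=20.0))
```

The point of the sketch is that the same load value leads to different conclusions depending on Utilization, which is why load alone says little.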

Check lparstat (if LPAR). Under "physc" you will see how many real CPU cores are being used. Note that this number cannot be higher than the number of virtual CPUs! Another important value is "%entc" which will show you how much of its entitled capacity the LPAR consumes (can/will be more than 100 if LPAR is uncapped). Add more processing units if this value is most of the time near 100 (capped) or way above (if uncapped).
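
The %entc rule of thumb could be sketched like this in Python. The samples are assumed to be %entc values you have already collected from lparstat; the function name and the thresholds for "near 100", "way above" and "most of the time" are hypothetical choices, so adjust them to your own judgement.

```python
def needs_more_processing_units(entc_samples, capped,
                                near_full_pct=95.0,
                                way_above_pct=150.0,
                                most_of_time=0.5):
    """Apply the %entc rule of thumb to a list of collected %entc samples.

    capped:        True if the LPAR is capped, False if uncapped.
    near_full_pct: what counts as "near 100" for a capped LPAR (assumption).
    way_above_pct: what counts as "way above" 100 for an uncapped LPAR (assumption).
    most_of_time:  fraction of samples that must reach the threshold (assumption).
    """
    if not entc_samples:
        raise ValueError("need at least one %entc sample")

    threshold = near_full_pct if capped else way_above_pct
    over = sum(1 for sample in entc_samples if sample >= threshold)
    return over / len(entc_samples) > most_of_time


if __name__ == "__main__":
    # Made-up samples for a capped LPAR that is mostly at its entitlement.
    print(needs_more_processing_units([97.0, 99.0, 96.0, 60.0], capped=True))
    # Made-up samples for an uncapped LPAR only slightly above entitlement.
    print(needs_more_processing_units([110.0, 120.0, 105.0], capped=False))
```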

There is much more to say about AIX tuning. Maybe you'd like to read this:

http://www.ibm.com/developerworks/aix/library/au-aix5_cpu/index.html
http://www.ibm.com/developerworks/aix/library/au-aix6tuning/

wmp





 
LVL 14

Assisted Solution

by:sjm_ee
sjm_ee earned 800 total points
ID: 33548855
Q At what load_average should I start to worry?

A Although you are to be commended for measuring what a "healthy" system looks like in terms of performance metrics, the metrics alone are not a useful test of whether your system is "healthy" or "not healthy" from the perspective of performance monitoring. What you are actually interested in is the performance of the application or applications that the system is running. Keep track of that as your primary objective. From the perspective of performance capacity planning, you should keep this historical data and use tooling that enables you to analyze the growth of your workload over time and plan accordingly.
 

Author Closing Comment

by:tommym121
ID: 33559730
Thank you guys.
