Solved

AIX CPU load average

Posted on 2010-08-27
Last Modified: 2012-05-10
I have some questions about the CPU load average reported by the uptime command. We have been collecting performance metrics on our AIX server for months and have tabulated the CPU load average over 1, 5 and 15 minutes.

Our AIX server has 2 CPUs.

Questions:

1. What does the CPU load average really mean? I have read some articles. Some describe it as the average number of processes in the queue; others as the average number of processes that are using the CPU.

2. What contributes to the load average count? Is it just the processes in the queue waiting to be executed plus the processes that are being executed?

3. Our server's load average has a minimum of 0.25, a maximum of 3.7 and an average of 0.92. At what value should I start to worry?

4. Given that we have 2 CPUs, should I divide the load average by two to get the load average per CPU?

Thank you.
Question by:tommym121
LVL 5

Expert Comment

by:shajithchandran
ID: 33544577
These are the averages of the number of processes in the run queue waiting for the cpu.
 
LVL 5

Expert Comment

by:shajithchandran
ID: 33544746
1. It's just the average number of processes waiting on the run queue.
2. Same as above.
3. Ideally, a value above 1 means that, on average, at least one thread is waiting for the CPU. How much is worrying really depends on your expectations. The numbers that you gave look OK to me.
4. No, you should not divide by two. The values given are system-wide averages, so they already take the number of CPUs into account.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 33545464
1) My apologies for contradicting, but load is the number of threads waiting to be dispatched plus the number of threads running on a CPU.
2) See above. Load rises when many threads are spawned in short intervals.
3) If it's AIX, what version? Starting with AIX 5.3 running on Power5 processors there is SMT (Simultaneous Multi-Threading), meaning that a single CPU core can hold the state of two threads and share its components (ALU, FPU, ...) between them. So there is no reason to worry about a load equalling the number of logical CPUs, which is, given AIX 5.3 and Power5, twice the number of cores in the system (or Virtual Processing Units in an LPAR).
4) The run queue is (from a logical perspective) a system wide queue. See my calculation under (3). Thus a "load average per CPU" doesn't make sense.
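
The comparison in (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample uptime line and its exact layout are made up for the example, the helper names are hypothetical, "2 CPUs" is taken to mean 2 cores, and two SMT threads per core is assumed as described for Power5.

```python
import re

def parse_load_averages(uptime_line):
    """Return the 1-, 5- and 15-minute load averages from an uptime line.

    Assumes the line ends with 'load average: a, b, c'; the exact layout
    may differ between systems.
    """
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

def logical_cpus(cores, smt_threads=2):
    """Logical CPUs = cores (or virtual processors) times SMT threads per core."""
    return cores * smt_threads

def load_exceeds_logical_cpus(load, cores, smt_threads=2):
    """True if the load is above the number of logical CPUs."""
    return load > logical_cpus(cores, smt_threads)

# Hypothetical uptime output, using the figures quoted in the question.
sample = "11:02AM   up 23 days,  18:42,  2 users,  load average: 0.25, 0.92, 3.70"
one_min, five_min, fifteen_min = parse_load_averages(sample)
print(logical_cpus(2), load_exceeds_logical_cpus(fifteen_min, cores=2))
```

With 2 cores and SMT on, the peak of 3.7 stays below the 4 logical CPUs; with SMT off (smt_threads=1) the same peak would exceed the 2 logical CPUs.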

wmp




Author Comment

by:tommym121
ID: 33545569
Hi, we are on AIX 6.1

Back to my question #3: at what load average should I start to worry? Is there any guidance on how to determine whether we have enough CPU based on the load average? We are looking for indicators of whether our current server capacity can handle additional workload. We will add more CPUs if necessary. I just want to know whether we can determine this from the load average.

Thank you.
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 300 total points
ID: 33546127
To answer this I really need to know what you mean by "2 CPUs". Cores? Virtual Processing Units? Logical Processors?

As I wrote, if the average load doesn't exceed the number of logical CPUs there is absolutely nothing to worry about with regard to load, though there may be in some other regard.

Load is only one criterion, and not the best. There are also wait for I/O and CPU % busy, for example.

Assuming you have only a few, but heavy, threads running, your load will be low but utilization will be high.
If this leads to performance problems, adding more virtual CPUs will have no effect in such a scenario. Here the processors need to be faster, which you could achieve by adding more processing units (if LPAR), which will make a single virtual CPU logically faster, or by purchasing a server with faster processors, if it's a standalone (not virtualized) server.

On the other hand, if you have lots of small threads to process, your load could be high, while Utilization will be rather low. The same is true if your machine has to wait for I/O.

That's to say, Average Load is not a good indicator.
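
The two scenarios above can be written down as a small Python sketch. The function name, the labels and the 80% "busy" threshold are assumptions chosen for illustration, not values taken from AIX or from this thread; the "both_high" and "neither" cases are added only to make the function complete.

```python
def describe_cpu_pressure(load, logical_cpus, cpu_busy_pct, busy_threshold=80.0):
    """Classify a (load, utilization) pair into the scenarios described above.

    busy_threshold is an assumed cut-off for "high" utilization.
    """
    high_load = load > logical_cpus
    high_util = cpu_busy_pct >= busy_threshold
    if high_util and not high_load:
        # Few but heavy threads: faster processors help, more virtual CPUs do not.
        return "few_heavy_threads"
    if high_load and not high_util:
        # Lots of small threads, or the machine is waiting for I/O.
        return "many_small_threads_or_io_wait"
    if high_load and high_util:
        # Not discussed in the thread; both indicators agree the CPUs are busy.
        return "both_high"
    return "neither"

print(describe_cpu_pressure(load=0.9, logical_cpus=4, cpu_busy_pct=95.0))
print(describe_cpu_pressure(load=6.0, logical_cpus=4, cpu_busy_pct=30.0))
```

The point of the sketch is that the same load value can land in different branches depending on utilization, which is why load alone is a weak indicator.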

Check lparstat (if LPAR). Under "physc" you will see how many real CPU cores are being used. Note that this number cannot be higher than the number of virtual CPUs! Another important value is "%entc" which will show you how much of its entitled capacity the LPAR consumes (can/will be more than 100 if LPAR is uncapped). Add more processing units if this value is most of the time near 100 (capped) or way above (if uncapped).
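
As a rough sketch of the %entc rule above, the Python function below takes a list of %entc readings collected by hand from lparstat and applies it. The function name and all three thresholds ("near 100", "way above", "most of the time") are assumptions for illustration; pick values that fit your own environment.

```python
def needs_more_processing_units(entc_samples, capped,
                                near_full=95.0, way_above=150.0,
                                most_of_time=0.5):
    """Apply the %entc rule of thumb to a list of readings.

    near_full, way_above and most_of_time are assumed thresholds:
    - capped LPAR: a reading counts as "hot" at or above near_full
    - uncapped LPAR: a reading counts as "hot" at or above way_above
    Returns True if more than most_of_time of the readings are hot.
    """
    if not entc_samples:
        raise ValueError("need at least one %entc sample")
    limit = near_full if capped else way_above
    hot = sum(1 for sample in entc_samples if sample >= limit)
    return hot / len(entc_samples) > most_of_time

# Hypothetical readings, not taken from a real system.
print(needs_more_processing_units([99, 98, 97, 40], capped=True))
print(needs_more_processing_units([120, 130, 90, 60], capped=False))
```

An uncapped LPAR running somewhat above 100 is not flagged here, since the advice only calls for more processing units when %entc is well above its entitlement most of the time.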

There is much more to say about AIX tuning. Maybe you'd like to read this:

http://www.ibm.com/developerworks/aix/library/au-aix5_cpu/index.html
http://www.ibm.com/developerworks/aix/library/au-aix6tuning/

wmp





 
LVL 14

Assisted Solution

by:sjm_ee
sjm_ee earned 200 total points
ID: 33548855
Q At what load_average should I start to worry?

A Although you are to be commended for measuring what a "healthy" system looks like in terms of performance metrics, the metrics alone are not a useful test of whether your system is "healthy" or "not healthy" from the perspective of performance monitoring. What you are actually interested in is the performance of the application or applications that the system is running. Keep track of that as your primary objective. From the perspective of performance capacity planning, you should keep this historical data and use tooling that enables you to analyze the growth of your workload over time and plan accordingly.
 

Author Closing Comment

by:tommym121
ID: 33559730
Thank you guys.