AIX CPU load average

I have some questions regarding the CPU load average reported by the uptime command. We have also been collecting performance metrics on our AIX server for several months, and have tabulated the 1-minute, 5-minute, and 15-minute CPU load averages.

Our AIX server has 2 CPUs.


1. What does the CPU load average really mean? I have read some articles: some describe it as the average number of processes in the queue, others as the average number of processes that are using the CPU.

2. What contributes to the load average count? Is it just the processes in the queue waiting to be executed plus the processes that are currently executing?

3. Our server's load average has a minimum of 0.25, a maximum of 3.7, and an average of 0.92. At what value should I start to worry?

4. Given that we have 2 CPUs, should I divide the load average by two to get the load average per CPU?

Thank you.
woolmilkporc Commented:
To answer this in practical terms I need to know what you mean by "2 CPUs". Cores? Virtual Processing Units? Logical Processors?

As I wrote, if the average load doesn't exceed the number of logical CPUs there is absolutely nothing to worry about with regard to load, though there may be in some other regard.

Load is only one criterion, and not the best one. There are also I/O wait and CPU % busy, for example.

Suppose you had only a few, but heavy, threads running: your load will be low, but utilization will be high.
If this leads to performance problems, adding more virtual CPUs will have no effect. In that scenario the processors need to be faster, which you could achieve by adding more processing units (if LPAR), making a single virtual CPU logically faster, or by purchasing a server with faster processors if it's a standalone (non-virtualized) server.

On the other hand, if you have lots of small threads to process, your load could be high while utilization is rather low. The same is true if your machine has to wait for I/O.

That's to say, average load is not a good indicator.

Check lparstat (if LPAR). Under "physc" you will see how many physical CPU cores are being used. Note that this number cannot be higher than the number of virtual CPUs! Another important value is "%entc", which shows how much of its entitled capacity the LPAR consumes (this can exceed 100 if the LPAR is uncapped). Add more processing units if this value is near 100 most of the time (capped) or well above 100 (uncapped).
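
To make the "%entc" rule of thumb concrete, here is a minimal Python sketch. It assumes you have already collected physc readings and know the LPAR's entitlement; the function names, the 95% threshold and the "more than half of the samples" cutoff are my own illustrative choices, not anything lparstat defines.

```python
def entc_percent(physc, entitlement):
    """Percentage of entitled capacity consumed (what lparstat shows as %entc)."""
    return 100.0 * physc / entitlement


def needs_more_processing_units(entc_samples, threshold=95.0, fraction=0.5):
    """Return True if more than `fraction` of the samples are at or above `threshold`.

    Both `threshold` and `fraction` are assumed values for illustration;
    pick ones that match your own definition of "most of the time near 100".
    """
    samples = list(entc_samples)
    if not samples:
        return False
    busy = sum(1 for s in samples if s >= threshold)
    return busy / len(samples) > fraction


# Hypothetical readings: an LPAR entitled to 1.0 processing units.
readings = [entc_percent(p, 1.0) for p in (0.99, 1.20, 0.40)]
print(needs_more_processing_units(readings))
```

For an uncapped LPAR you would probably raise the threshold well above 100, since short bursts over the entitlement are normal there.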

There is much more to say about AIX tuning. Maybe you'd like to read this:


These are the averages of the number of processes in the run queue waiting for the CPU.
1. It's just the average number of processes waiting in the run queue.
2. Same as above.
3. Ideally, if the value is more than 1, that means that, on average, there is at least one thread waiting for the CPU. How much is worrying really depends on what your expectations are. The numbers that you gave look OK to me.
4. No, you should not divide by two. The values given are system-wide averages, so they already take the number of CPUs into account.

1) My apologies for contradicting, but load is the number of threads waiting to be dispatched plus the number of threads running on a CPU.
2) See above. Load rises when many threads are spawned in short intervals.
3) If it's AIX, what version? Starting with AIX 5.3 running on Power5 processors there is SMT (Simultaneous Multi-Threading), meaning that a single CPU core can hold the state of two threads and share its components (ALU, FPU, ...) between them. So there is no reason to worry about a load equal to the number of logical CPUs, which is, given AIX 5.3 and Power5, twice the number of cores in the system (or Virtual Processing Units in an LPAR).
4) The run queue is (from a logical perspective) a system-wide queue. See my calculation under (3). Thus a "load average per CPU" doesn't make sense.
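
The calculation under (3) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and it assumes two hardware threads per core (SMT on Power5). Whether the "2 CPUs" in the question are cores or logical CPUs is still open, so both readings are shown.

```python
def logical_cpus(cores, smt_threads=2):
    """Logical CPUs = cores (or virtual processors) times SMT threads per core."""
    return cores * smt_threads


def load_is_comfortable(load_avg, cores, smt_threads=2):
    """True if the load average does not exceed the number of logical CPUs."""
    return load_avg <= logical_cpus(cores, smt_threads)


# The min / average / max load averages quoted in the question.
for label, load in (("min", 0.25), ("avg", 0.92), ("max", 3.7)):
    as_cores = load_is_comfortable(load, cores=2)                   # 2 cores with SMT -> 4 logical CPUs
    as_logical = load_is_comfortable(load, cores=2, smt_threads=1)  # "2 CPUs" already logical
    print(label, load, as_cores, as_logical)
```

If the two CPUs are cores with SMT enabled, even the 3.7 peak stays within the four logical CPUs; if they are already logical CPUs, the peak exceeds them while the 0.92 average does not.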


tommym121 (Author) Commented:
Hi, we are on AIX 6.1.

Back to my question #3: at what load average should I start to worry? Is there any guidance on how to determine whether we have enough CPU based on the load average? We are looking for indicators of whether our current server capacity can handle additional workload. We will add more CPUs if necessary. We just want to know whether we can determine this from the load average.

Thank you.
sjm_ee Commented:
Q At what load_average should I start to worry?

A Although you are to be commended for measuring what a "healthy" system looks like in terms of performance metrics, the metrics alone are not a useful test of whether your system is "healthy" or "not healthy" from the perspective of performance monitoring. What you are actually interested in is the performance of the application or applications that the system is running. Keep track of that as your primary objective. From the perspective of performance capacity planning, you should keep this historical data and use tooling that enables you to analyze the growth of your workload over time and plan accordingly.
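
As a rough illustration of analyzing growth over time, the Python sketch below fits a straight line to evenly spaced historical samples of whatever metric you track (monthly application response time, peak load average, and so on) and estimates how many periods remain before the trend reaches a limit you choose. The function names, the sample data and the limit are all hypothetical, and a straight-line fit is a simplifying assumption; real capacity-planning tooling does considerably more.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(values, limit):
    """Periods after the last sample until the fitted line reaches `limit`.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    last = len(values) - 1
    return max(0.0, (limit - intercept) / slope - last)


# Hypothetical monthly averages of some tracked metric.
history = [0.70, 0.78, 0.85, 0.92]
print(periods_until(history, limit=2.0))
```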
tommym121 (Author) Commented:
Thank you guys.