AIX 5.1 -- What is the load average listed by "uptime"?

The load average listed by uptime does not seem to be directly related to the idle time listed in topas.

I see idle hovering between 0 and 20 in topas, yet uptime still reports a load average of only 4.

Basically, I am trying to figure out whether the system can handle 70 more telnet sessions. We do interviewing and have almost 300 interviewer seats now, and I want to know whether it could handle another 70.



Asked by Xetroximyn
woolmilkporc Commented:
The load average is the average number of runnable processes over the preceding 1-, 5-, and 15-minute intervals.
It's not really related to idle time.
If you have a few heavy threads, the load might be low while CPU utilization is high (giving low idle time).
On the other hand, if an application fires up many small, short-lived, inexpensive threads, the load might be elevated while CPU utilization stays low (and idle time high).
With high I/O wait you will see a high load alongside plenty of idle time. The same is true for a situation with heavy paging I/O.
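To make the difference between load and idle time concrete, here is a toy Python model. It is only a sketch and not how AIX computes anything internally; the function name, the 2-CPU box, and the sample numbers are made up for illustration. It treats load as the average count of runnable threads per sample, and idle as the share of CPU slots left unused.

```python
def load_and_idle(runnable_per_sample, cpus):
    """Return (average load, idle percent) for a list of runnable-thread counts.

    Toy model (an assumption, not AIX internals): in each sample,
    at most `cpus` of the runnable threads can actually be on a CPU.
    """
    samples = len(runnable_per_sample)
    load = sum(runnable_per_sample) / samples
    busy = sum(min(r, cpus) for r in runnable_per_sample) / (samples * cpus)
    return load, round(100 * (1 - busy), 1)


# Two heavy threads pinning a hypothetical 2-CPU box: modest load, no idle time.
heavy = load_and_idle([2, 2, 2, 2], cpus=2)

# One burst of many short-lived threads: higher average load, mostly idle.
bursty = load_and_idle([12, 0, 0, 0], cpus=2)

print(heavy, bursty)
```

In this model the "heavy" case has the lower load average but zero idle time, while the "bursty" case has the higher load average and a mostly idle machine.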
As for the telnet sessions: the sessions themselves will not be a problem. The question is what applications are started from those sessions, how much CPU they will need, and how many child processes or threads they will launch.
As for topas: if you're running an LPAR in shared processor mode, the idle value is misleading. In that scenario topas will always show low idle time, because the partition cedes its unused CPU share to other partitions.
If it's not an LPAR, idle% of 0-20 is not very much. How high is your I/O wait? If that value is also low, I fear your machine will not be able to support many additional applications.
wmp
 
 
 
Xetroximyn (Author) Commented:
How do I find out if it is an LPAR?

 
I watched topas for a few minutes last night when about half the ports were in use.
Below are the ranges I was seeing.
   
Kernel = 8-25
User = 25-80
Wait = 0-81
Idle = 0-65
   
For a good while, Idle mostly lingered below 10-20.
 
woolmilkporc Commented:
Issue

uname -L

If you get "-1 NULL" or "1 NULL", this is not an LPAR.
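If you want to script that check, the rule above could be sketched in Python like this. The helper names `is_lpar` and `check_this_host` are made up for illustration, and the only output format assumed is the two-field one quoted above; treat it as a sketch, not a tested AIX tool.

```python
import subprocess


def is_lpar(uname_l_output):
    """Apply the rule of thumb above to the text printed by `uname -L`."""
    fields = uname_l_output.split()
    if not fields:
        raise ValueError("no output from uname -L")
    # "-1 NULL" or "1 NULL" means this is not an LPAR.
    return fields not in (["-1", "NULL"], ["1", "NULL"])


def check_this_host():
    # Only meaningful on AIX; other systems may not accept the -L flag.
    out = subprocess.run(["uname", "-L"], capture_output=True, text=True).stdout
    return is_lpar(out)
```

Calling `is_lpar("-1 NULL")` returns `False`; any other partition number and name is reported as an LPAR.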

If it is not an LPAR (which I assume, given AIX 5.1), idle% below 10 means that there is not much CPU capacity in reserve.

I'd suggest not launching all 70 additional sessions at once.

Rather, increase the load step by step while carefully watching your system. And ask your interviewers about the performance and response times of their application. Ask more than a few people, and ask at various times of day.
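As a rough illustration of "step by step", the 70 new sessions could be split into batches, with a pause between batches to watch topas and talk to the interviewers. The batch size of 20 below is an arbitrary assumption, and `ramp_plan` is a made-up helper name.

```python
def ramp_plan(total_new, step):
    """Split total_new sessions into batches of at most `step` sessions."""
    if step <= 0:
        raise ValueError("step must be positive")
    plan = []
    remaining = total_new
    while remaining > 0:
        batch = min(step, remaining)
        plan.append(batch)
        remaining -= batch
    return plan


# 70 additional sessions, added 20 at a time (step size chosen arbitrarily).
batches = ramp_plan(70, 20)
print(batches)
```

Here `batches` is `[20, 20, 20, 10]`: check the system after each batch before adding the next one.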

But (to repeat what I already wrote) it all depends on what will happen during those telnet sessions. How heavy is your "interview" application? Is a database involved? Does it generate much I/O traffic, maybe due to queries? Are there many calculations to do?

wmp
 
Xetroximyn (Author) Commented:
I think a lot of what the programmers and project managers do on the system (running reports, etc.) spikes the CPU a lot.

I looked at the system while we were nearly full, and late enough that there probably was not much else being done on the system besides interviewing. It still spiked, but idle was generally at 20-40, sometimes rising to 60 or so. There were a couple of periods of 10 or so seconds sustained at 50-60 idle.

I feel like the interviewing may be using 40-50%, with other work (reports being run, etc.) causing the rest of the less constant CPU usage. So that other work might just take longer to run if interviewing starts taking 60-70%.
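As a back-of-the-envelope check on that guess, here is a naive linear projection in Python. It assumes CPU use scales in proportion to the number of seats, takes "almost 300" as exactly 300, and ignores I/O wait entirely, so it is only a rough sketch. `projected_cpu` is a made-up helper name.

```python
def projected_cpu(current_pct, current_seats, extra_seats):
    """Scale a CPU percentage linearly with the number of seats (an assumption)."""
    return current_pct * (current_seats + extra_seats) / current_seats


# Estimated 40-50% CPU for roughly 300 seats, projected to 370 seats.
low = projected_cpu(40, 300, 70)
high = projected_cpu(50, 300, 70)
print(round(low, 1), round(high, 1))
```

Under those assumptions the interviewing share would land at roughly 49% to 62% of CPU with the extra 70 seats.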

Do you have any thoughts?


If you are curious, here is 5 minutes of topas.
http://screencast.com/t/ODdmNzM4MW


 
woolmilkporc Commented:
Yep,
interesting stuff.
It seems that idle% going down was in most cases due to wait% going up, which would indicate that your system is somewhat I/O constrained.
Your disk paths seem rather well balanced, so I don't think there are any real hotspots that could be spread across more disks.
Your users/jobs do produce peaks in CPU load, but the average percentage seems to stay below 50%.
I think that if you actually run into performance problems with the new sessions, moving to faster disks, maybe a SAN box with some (more) cache available, will have a better effect than changing CPUs.
Should you consider changing server hardware nonetheless, it is better to think of faster processors than of more processors.
Moving your report jobs (which might be I/O intensive) off-shift could also help a lot!
wmp
 
Xetroximyn (Author) Commented:
Thanks!
Question has a verified solution.
