questil
asked on
AIX 6.1 very slow
I have an AIX 6.1 machine that occasionally becomes very slow.
There is DB2 installed on this machine, but running topas shows no obvious reason for the slowness.
CPU: above 95% idle on both CPUs.
Network: usually less than 10kbps load.
Disks: less than 5% busy most of the time.
Memory: of 16GB, 27.7% comp, 8.8% noncomp and 8.8% client
Paging: size 4GB, 100% free
Any idea how I can find what's causing the system to become almost unresponsive? Only after a reboot can we work with the system, until the next time…
Thanks,
Tal
Do you use EXTSHM with DB2?
If so this article could be of interest to you:
https://www-304.ibm.com/support/docview.wss?rs=71&uid=swg21191295&wv=1
wmp
ASKER
The slowness is in responding to initial ssh connections, and in the shell's response to any command.
We do not use EXTSHM with DB2.
Is this an LPAR?
If so, could it be that the whole managed system is overloaded from time to time?
How high is the "load" anyway?
How many processes are running? Perhaps lots of zombies?
Any messages in the errorlog (errpt)?
Is your DNS (if any) always responsive?
How much free space is in /, /var and /tmp?
wmp
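A couple of the checks above can be scripted. This is only a rough sketch: the function names are made up for illustration, and it assumes that `ps -ef` marks zombies as `<defunct>` and that `df -Pk` prints the POSIX column layout (capacity in column 5, mount point in column 6).

```shell
#!/bin/sh
# Hypothetical helpers; both read command output on stdin, so they can be
# tried against saved output as well as on a live system.

# Count zombie processes in `ps -ef` output (assumes they show as <defunct>).
# The bracket in the pattern keeps grep from matching its own command line.
count_defunct() {
    grep -c '[<]defunct>' || true
}

# Print mount points whose usage is at or above the given percentage,
# reading POSIX-format `df -Pk` output.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        used = $5
        sub(/%/, "", used)
        if (used + 0 >= limit + 0) print $6
    }'
}

# Example use on a live system:
#   ps -ef | count_defunct
#   df -Pk / /var /tmp | full_filesystems 90
```

The 90% threshold is an arbitrary choice for the example, not a recommended limit.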
ASKER
It's a real machine, not an LPAR.
The load is very low (during system slowness):
CPU: above 95% idle on both CPUs.
Network: usually less than 10kbps load.
Disks: less than 5% busy most of the time.
Memory: of 16GB, 27.7% comp, 8.8% noncomp and 8.8% client
Paging: size 4GB, 100% free
Currently the system is fine; we only have 188 running processes and there are no zombies.
There are no obvious alerts in errpt that can explain this behavior.
The DNS is always responsive. (We experience this kind of problem only on AIX machines with DB2 installed.)
Free space in /, /var and /tmp is between 35% and 50% (each of them is 1GB).
Thanks,
Tal
I meant the "load" as displayed by "uptime" or "w".
ASKER
bash-3.2# w
04:57PM up 2 days, 2:04, 14 users, load average: 1.56, 1.73, 1.95
ASKER
woolmilkporc?
Sorry for the delay,
we're just doing lots of migrations to new hardware.
Your load seems tolerable for a 2-way machine, although it's not really "low".
Or is this a 1-way machine with 2 logical CPUs due to SMT?
What do you see with "smtctl"?
I'd really suspect a memory problem, but if you're sure that there is no paging ...
What do you see with topas on the right half under "pgspin" and "pgspout"?
Are you sure that EXTSHM is not set?
Issue "echo $EXTSHM" as the user who starts DB2!
wmp
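To put the load figure in relation to the CPU count, something like the following sketch could help. `load_per_cpu` is a made-up helper name; it assumes the `uptime` or `w` header line contains "load average:" followed by comma-separated figures with a "." decimal separator, as in the output posted above. The CPU count is passed in by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the 1-minute load average divided by the
# number of logical CPUs. Reads one `uptime` or `w` header line on stdin.
load_per_cpu() {
    ncpu=$1
    sed -n 's/.*load average: *\([0-9.]*\),.*/\1/p' |
        awk -v n="$ncpu" 'n > 0 { printf "%.2f\n", $1 / n }'
}

# Example use on a live system, for a machine with 2 logical CPUs:
#   uptime | load_per_cpu 2
```

With the line posted above (1.56 on 2 CPUs) this prints 0.78.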
ASKER
bash-4.0# smtctl
This system is SMT capable.
This system supports up to 2 SMT threads per processor.
SMT is currently enabled.
SMT boot mode is not set.
SMT threads are bound to the same physical processor.
proc0 has 2 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0
proc2 has 2 SMT threads.
Bind processor 2 is bound with proc2
Bind processor 3 is bound with proc2
"pgspin" and "pgspout" = 0
I ran "echo $EXTSHM" as the DB2 user and there was no output, just a blank line.
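Note that `echo $EXTSHM` prints a blank line both when the variable is unset and when it is set to an empty string. A small sketch that tells the two apart follows; `var_state` is a made-up helper name, and it only reports the environment of the shell it runs in, not that of an already running DB2 process.

```shell
#!/bin/sh
# Hypothetical helper: report whether a variable is unset, set but empty,
# or set to a value. The argument is assumed to be a plain variable name.
var_state() {
    eval "_set=\${$1+x} _val=\${$1-}"
    if [ -z "$_set" ]; then
        echo unset
    elif [ -z "$_val" ]; then
        echo empty
    else
        printf 'set:%s\n' "$_val"
    fi
}

# Example use, run as the user who starts DB2:
#   var_state EXTSHM
```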
It seems that your current observations are from a responsive system in good state.
I fear we will have to wait until the described issues happen, to then check again for paging etc.
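Since the problem is intermittent, one option is to record the system state at regular intervals so the figures from the next slowdown are on file. A minimal sketch, assuming a made-up helper name (`log_snapshot`) and an arbitrary log path:

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped copy of a command's output
# to a log file. Usage: log_snapshot LOGFILE COMMAND [ARGS...]
log_snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@" 2>&1
    } >> "$logfile"
}

# Example use (e.g. from cron); the log path is an assumption:
#   log_snapshot /tmp/slowdown.log uptime
#   log_snapshot /tmp/slowdown.log vmstat 1 5
```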
ASKER
The load & paging values are the same during system slowness...
ASKER CERTIFIED SOLUTION
ASKER
Hi wmp,
I added 20GB to the paging space, hopefully it will solve the problem...
Thanks!
Tal
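To confirm the new paging space size and keep an eye on its usage, the summary from `lsps -s` can be reduced to one line. This is a sketch only: `paging_summary` is a made-up name, and it assumes the output is a header line followed by a single data line with the total size and the percentage used.

```shell
#!/bin/sh
# Hypothetical helper: print total paging space and percent used from
# `lsps -s` output on stdin (assumed: one header line, then one data
# line such as "4096MB  1%").
paging_summary() {
    awk 'NR == 2 { sub(/%/, "", $2); printf "total=%s used_pct=%s\n", $1, $2 }'
}

# Example use on AIX:
#   lsps -s | paging_summary
```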
Until you describe what is slow, there is little anyone can do for you.