The top command on our RHEL4 server shows the CPU in the WA (I/O wait) state about 50% of the time. I suspect this is caused by one of several NFS-mounted volumes. How do I determine exactly which device and which user process is causing this wait state? BTW, why should a CPU be waiting at all in an interrupt-driven system?
Vinod
Hi,
You can use the iostat tool. Please see http://www.linuxcommand.or
Run
iostat -x 5
And let it run for a little while, checking the last three columns:
await
The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
svctm
The average service time (in milliseconds) for I/O requests that were issued to the device.
%util
Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
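As a rough illustration of how those three columns can be read programmatically, here is a minimal Python sketch. It assumes the older sysstat layout in which the device table starts with a "Device:" header line; the function names and the 90% threshold are illustrative choices, not part of iostat.

```python
def parse_iostat_devices(text):
    """Parse `iostat -x` text into one dict per device line.

    Assumes each device table begins with a header whose first field is
    "Device:" (or "Device") and that every device row has one value per
    header column.
    """
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "avg-cpu:":
            header = None  # the next numeric line is CPU data, not a device
        elif fields[0].rstrip(":") == "Device":
            header = fields[1:]
        elif header is not None and len(fields) == len(header) + 1:
            row = {"device": fields[0]}
            for name, value in zip(header, fields[1:]):
                row[name] = float(value)
            rows.append(row)
    return rows


def queue_wait_ms(row):
    """Time spent queued: await minus svctm (both in milliseconds)."""
    return row["await"] - row["svctm"]


def saturated_devices(rows, util_limit=90.0):
    """Return rows whose %util is at or above util_limit (an assumed cutoff)."""
    return [row for row in rows if row["%util"] >= util_limit]
```

An empty result from saturated_devices() suggests the local disks are not the bottleneck; iostat -x as shown here only reports block devices, so NFS waits would not appear in it.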
What should I make of the following output? It does not say anything about NFS volumes, and it does not tell me which PID is causing the near 100% iowait.
# iostat -x 5
Linux 2.6.9-55.0.6.ELsmp (myhost.princeton.edu) 2007-10-04
avg-cpu: %user %nice %sys %iowait %idle
0.25 0.00 0.40 99.35 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.80 0.00 0.60 0.00 19.16 0.00 9.58 32.00 0.00 1.33 1.33 0.08
avg-cpu: %user %nice %sys %iowait %idle
0.80 0.00 1.40 92.65 5.15
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.40 0.00 35.33 0.00 461.48 0.00 230.74 13.06 2.32 7.69 0.64 2.26
Hi,
For NFS-mounted shares, please use the nfsstat tool.
See the output of:
nfsstat -c -n -m
The link below is the man page for nfsstat:
http://linux.die.net/man/8
nfsstat does not explain the nearly 100% iowait. Neither iostat nor nfsstat gives any hint about which process is to blame, so I do not know which one to kill. Our cluster has multiple NFS servers and multiple client nodes. All NFS volumes are mounted on all clients. When any NFS client or server is stuck, it affects all users on our main login server, which is only an NFS client. Simple commands like df hang forever, Ctrl-C does not work, %iowait goes to 100%, and all users start screaming. I need to put the system back into a usable state without rebooting.
Vinod
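One common way to see which processes are stuck waiting on I/O is to look for tasks in uninterruptible sleep, which ps reports with a state starting with "D". The sketch below is only an illustration under that assumption: it filters the output of `ps -eo pid,stat,args`, and the function names are made up for this example.

```python
import subprocess


def blocked_processes(ps_output):
    """Return (pid, state, command) for tasks whose state starts with 'D'.

    Expects text in the shape produced by `ps -eo pid,stat,args`:
    a PID, a state string, then the command line.
    """
    blocked = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skips the header line and malformed rows
        pid, state, command = parts
        if state.startswith("D"):
            blocked.append((int(pid), state, command.strip()))
    return blocked


def current_blocked_processes():
    """Run ps on the local machine and filter its output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,args"],
        capture_output=True, text=True, check=True,
    )
    return blocked_processes(result.stdout)
```

Running current_blocked_processes() a few times in a row and noting which PIDs stay in the list gives a shortlist of the commands that are waiting, for example a hung df.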
by: ravenpl, posted on 2007-10-02 at 23:12:59, ID: 20004277
I don't know the answer to the first question.
> BTW, why CPU should be waiting at all in an interrupt driven system?
The OS is in the wait state when there is pending I/O (disk or NFS access, etc.) and no other task to schedule. In other words, there is nothing to do (the CPU is idle) because we have to wait for the I/O to complete first.
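To make that accounting concrete, here is a small sketch of how an iowait percentage can be derived from two samples of the aggregate "cpu" line in /proc/stat. It assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, then optionally steal); the function names and the sample numbers are hypothetical.

```python
def cpu_times(stat_line):
    """Parse a /proc/stat 'cpu' line into a list of integer tick counters.

    Only the fields up to and including steal (at most eight) are used,
    so later guest counters are not counted twice.
    """
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:9]]


def iowait_percent(before_line, after_line):
    """Share of elapsed CPU ticks spent idle with I/O pending, in percent."""
    before = cpu_times(before_line)
    after = cpu_times(after_line)
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Hypothetical samples taken a few seconds apart.
first = "cpu 100 0 50 800 50 0 0"
second = "cpu 110 0 60 800 250 0 0"
print(round(iowait_percent(first, second), 2))
```

A high value here only says the CPUs were idle while some I/O was outstanding; it does not mean the CPUs were busy, which matches the explanation above.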