top command in Sun Fire v880 Server


I am facing a serious problem with a Sun Fire v880 server.

When I run the top command, this is a brief excerpt of the output (shown further down).

As you can see, iowait is at 98.6% and fluctuates around that level.
Because of this, logged-in users cannot start large processes.
For background, this Sun server is attached to EVA SAN storage.

load averages:  0.03,  0.04,  0.06                                                                                                 17:17:45
368 processes: 367 sleeping, 1 on cpu
CPU states:  0.0% idle,  0.1% user,  1.3% kernel, 98.6% iowait,  0.0% swap
Memory: 32G real, 23G free, 3784M swap in use, 26G swap free

  3576 abhishek   1  48    0 4224K 3304K sleep    0:03  0.14% tcsh
 24191 pgnsst1    1  58    0 1824K 1496K sleep    0:42  0.04% top
  2843 root       1  58    0 1768K 1400K cpu/0    0:02  0.04% top
 15990 oracle     1  58    0    0K    0K sleep    0:15  0.02% oracle
 20826 oracle     1  58    0    0K    0K sleep    0:18  0.01% oracle
  1961 root       1  58    0 6872K 4816K sleep    0:00  0.01% dtterm
 15749 oracle     1  58    0    0K    0K sleep    0:15  0.01% oracle
  3575 abhishek   1  58    0 5176K 4176K sleep    0:00  0.01% xterm

Is there any bug or hardware problem that I can trace?

Please provide a few suggestions regarding this.
Gugro commented:
Run iostat -xzn 10 and see which disks are busy. The system is waiting for I/O.
Is there a lot of I/O, or is the system only waiting on hung I/Os in the disk subsystem?
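A rough sketch of how that output could be filtered, in case it helps. It assumes the usual Solaris iostat -xn column order (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device). The function name busy_disks and the default thresholds are made up for illustration, not recommended values.

```shell
# busy_disks [MIN_BUSY_PCT] [MIN_ASVC_MS]   (hypothetical helper)
# Reads `iostat -xn` style output on stdin. Assumed column order:
#   r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# Prints device, %b and asvc_t for every row at or over either threshold.
# On Solaris, /usr/bin/awk is the old awk without -v; set AWK=nawk there.
busy_disks() {
    "${AWK:-awk}" -v busy="${1:-60}" -v svc="${2:-30}" '
        # data rows have 11 fields and start with a number; headers do not
        NF == 11 && $1 ~ /^[0-9.]+$/ {
            if ($10 + 0 >= busy || $8 + 0 >= svc)
                printf "%s busy=%s%% asvc_t=%sms\n", $11, $10, $8
        }'
}
```

It would be used as: iostat -xzn 10 | busy_disks 60 30. A device that sits near 100 %b with a high asvc_t but very low kr/s and kw/s would point towards hanging I/O rather than heavy I/O.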
rugdog commented:

1. Check /var/adm/messages and the Oracle logs for hints of I/O problems.
2. I see you are using Oracle; which version?  I recently saw a problem with Oracle 8.1.7 and the SAN subsystem where there was a lot of I/O wait. The symptoms and suggestions for fixing it can be found on the Oracle MetaLink web site:

Oracle Metalink article:
Subject: Warning "aiowait timed out 1 times" in alert.log
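To check both places quickly, something like the following might do. It is only a sketch: the function names are hypothetical, and the location of the Oracle alert log depends on the installation, so it is passed in as an argument rather than guessed.

```shell
# count_aiowait FILE   (hypothetical helper)
# Prints how many lines of an Oracle alert log contain the aiowait warning.
count_aiowait() {
    grep -c 'aiowait timed out' "$1"
}

# warning_lines FILE   (hypothetical helper)
# Prints every line mentioning a warning, e.g. from /var/adm/messages.
warning_lines() {
    grep -i 'warning' "$1"
}
```

For example, warning_lines /var/adm/messages lists the kernel warnings, and count_aiowait run against the alert log shows whether the MetaLink symptom is present at all.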
Lots of questions:

How many CPUs ?

How much memory ?

How much storage ?

How many HBAs ?

Whose SAN software are you using (I'm not an HP expert, so I'm not sure whether SAN Foundation Suite is supported here) ?  Are all HBA paths active ?

How many LUNs are exposed off the EVA SAN storage ?

What else besides Oracle is running on the box ?

How large is the Oracle SGA ?

Have you run "iostat -x 5" for a period of time to see what LUNs are being hit ?

Are you using SVM, VxVM, or hardware based RAID ?  What RAID level(s) ?

If hardware based RAID, are you still using SVM or VxVM to set up a plaid RAID configuration by putting a simple RAID 0 stripe across all of the exposed LUNs ?  What is the interlace size ?  Is this interlace size tuned to effectively support the backend EVA RAID stripe configuration ?

Are you using cooked or RAW file systems with Oracle ?

If cooked, are you using VxFS, QFS, or UFS ?

If VxFS or UFS, are you using either file system's variant of forced direct I/O ?

If UFS, is maxcontig tuned to effectively support the backend EVA RAID stripe configuration ?
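On the maxcontig question, a commonly quoted rule of thumb is to make the UFS cluster size (maxcontig times the file system block size) equal to one full stripe, i.e. interlace size times the number of data columns. The arithmetic can be sketched as below. The function name and the example figures are hypothetical and are not taken from this system; the 8 KB default is the usual UFS block size and should be checked against the actual file system.

```shell
# suggest_maxcontig INTERLACE_KB DATA_COLUMNS [FS_BLOCK_KB]   (hypothetical helper)
# Prints the maxcontig value (in file system blocks) whose cluster size
# equals one full stripe: interlace * columns / block size.
suggest_maxcontig() {
    # expr evaluates * and / left to right, so this is (interlace * columns) / block
    expr "$1" \* "$2" / "${3:-8}"
}

# Example with made-up numbers: 128 KB interlace, 4 data columns, 8 KB blocks
suggest_maxcontig 128 4    # prints 64
```

If the current value turns out to be far from the result, tunefs -a is the usual way to change maxcontig on an existing UFS file system.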
Oh ... forgot to ask:

32 bit Oracle or 64 bit Oracle ?

Which version of Oracle ?
This looks like a case where someone is copying data from and to the same partition.
You have probably got an inferior hard drive or controller.
