A "99% IO wait" is perfectly normal behaviour for an application where IO is the bottleneck.
And I think this is your case:
- The best possible case is when your OS can keep the file being read in its cache and all the writes can be made sequential by a write-back cache on the RAID HBA. This best case allows a throughput of about 200MB/s (assuming the CPU part of the task is almost instantaneous) on a file-by-file conversion basis
- The worst case is when everything is using the RAID array and you have parallel conversion tasks, because then your array cannot turn the write work into sequential writes (thus getting a poor 20MB/s throughput)
Check my guesses:
- All folders (OS, /tmp, ~home, ...) are on the 6-drive RAID 5 array
- The partition was not aligned to a multiple of the physical stripe size (a partition needs a small offset to store its own info; this offset needs to be a multiple of the stripe size; a 64KB stripe size, i.e. the size of 1 array block on 1 HDD, requires a 64/128/192/256/... KB offset to be "aligned" with the array)
- The RAID HBA does not have a battery-backed write-back cache
- The convert command has been configured to be a highly parallel task
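
To make the alignment guess concrete, here is a minimal Python sketch of the check. It assumes 512-byte sectors and the 64KB stripe size used as an example above; the helper names (is_aligned, next_aligned_offset) are made up for illustration, and the 63-sector start is the classic default of older DOS-style partitioning tools, not something I know about your disk. Read your real start sector from your partitioning tool and plug it in.

```python
SECTOR_BYTES = 512          # assumption: classic 512-byte sectors
STRIPE_BYTES = 64 * 1024    # the 64KB stripe size used as an example above


def is_aligned(offset_bytes: int, stripe_bytes: int = STRIPE_BYTES) -> bool:
    """True when the partition start falls exactly on a stripe boundary."""
    return offset_bytes % stripe_bytes == 0


def next_aligned_offset(offset_bytes: int, stripe_bytes: int = STRIPE_BYTES) -> int:
    """Round a start offset up to the next multiple of the stripe size."""
    return ((offset_bytes + stripe_bytes - 1) // stripe_bytes) * stripe_bytes


# Hypothetical start sectors, for illustration only.
legacy_start = 63 * SECTOR_BYTES     # old DOS-style default: 32256 bytes
aligned_start = 128 * SECTOR_BYTES   # 65536 bytes = exactly one 64KB stripe

print(is_aligned(legacy_start))           # False
print(is_aligned(aligned_start))          # True
print(next_aligned_offset(legacy_start))  # 65536
```

With a misaligned start, a write that should fit in one array block can straddle two of them, so the array does extra work for the same amount of data.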
Some possible actions are:
- If you have a parameter to force the RAID HBA cache to be dedicated to writes: do it!
- If you can move your /tmp folder somewhere other than the array: do it!
- If you can lower the parallel setting of your "convert" command: do it!
- Next time: go to RAID 10, as the parity RAID levels (5/50/6/60) are just NOT suitable for most IO usage! You may "lose" some capacity but you will not have to worry about performance any more!
- Next time: alignment costs almost nothing at partition time
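
A rough back-of-the-envelope sketch of why parity RAID hurts on small random writes, in Python. It uses the usual rule of thumb that one small RAID 5 write costs 4 disk operations (read data, read parity, write data, write parity) while RAID 10 costs 2 (one write per mirror side). The 100 IOPS per drive figure is only an assumption for illustration, not a measurement of your disks, and the function name is hypothetical.

```python
RAID5_WRITE_PENALTY = 4    # read data + read parity + write data + write parity
RAID10_WRITE_PENALTY = 2   # one write on each side of the mirror

DRIVES = 6                 # the 6-drive array discussed above
IOPS_PER_DRIVE = 100       # assumption: illustrative figure for one HDD


def random_write_iops(drives: int, iops_per_drive: int, write_penalty: int) -> float:
    """Estimate the small random write IOPS the whole array can sustain."""
    return drives * iops_per_drive / write_penalty


raid5_iops = random_write_iops(DRIVES, IOPS_PER_DRIVE, RAID5_WRITE_PENALTY)
raid10_iops = random_write_iops(DRIVES, IOPS_PER_DRIVE, RAID10_WRITE_PENALTY)

print(raid5_iops)   # 150.0
print(raid10_iops)  # 300.0
```

This only applies when the writes reach the disks as small random writes. When a write-back cache manages to turn them into full sequential stripes, the array can skip the read steps, which is the "best case" described above.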





by: forrie Posted on 2009-11-02 at 13:03:17 ID: 25723696
The virtual host OS is CentOS 5.4, fully up to date.