Unbearably slow hard drive speed on IBM X3650 M2

The server is an X3650 M2 with dual quad-core CPUs, 20GB memory, and 4 x 10k SAS drives in RAID 10 running Win2008 64-bit on an IBM ServeRAID M1015 SAS RAID controller.

I did a Bart's Stuff Test. The hard drive only manages 35MB/s! Any of my home computers runs faster than that!

What could be wrong? The only thing I can think of is that when I installed the RAID controller, I took out the old onboard LSI RAID card and put the controller in that spot.

35MB/sec could very well be NORMAL, depending on the I/O block size of your benchmark and the I/O block size configured on your controller.

For example, if you set up a 512KB block size on the RAID and took the default 4KB cluster size in NTFS, then any time you write anything, your controller is obligated to write a full 512KB stripe unit.

Also, whatever the benchmark reported, you did NOT get only 35MB/sec: the disks actually wrote at a minimum 70MB/sec, more likely 140MB/sec, due to the redundancy and striping at the physical-disk level.
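To make the arithmetic concrete, here is a toy Python sketch of that write amplification. The 512KB/4KB figures come from the example above; the factor of two assumes RAID 1/10 mirroring, and the model assumes the worst case where every small write forces a full stripe-unit write:

```python
# Worst-case write amplification when the RAID stripe unit is much larger
# than the NTFS cluster size. Illustrative model only, not a measurement.

def write_amplification(stripe_kb, ntfs_kb, mirrored=True):
    """Physical KB written per KB of application data, assuming every
    small write forces a full stripe-unit write, doubled by mirroring."""
    physical_kb = stripe_kb * (2 if mirrored else 1)
    return physical_kb / ntfs_kb

# 512KB stripe unit vs. 4KB NTFS clusters on mirrored RAID:
print(write_amplification(512, 4))    # 256.0 -> each 4KB write costs 1MB on disk
# Stripe unit matched to a 64KB NTFS cluster size:
print(write_amplification(64, 64))    # 2.0 -> only the mirroring overhead is left
```

The second case is why matching the stripe size to the NTFS allocation size matters so much.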

To speed up performance, tune the RAID stripe size to match the NTFS allocation size, and Google "aligned NTFS I/Os" to see how to use diskpart to perform this magic. It will require a reload of the O/S.

P.S. A quick thing you can do is just enable the write cache if it is not already enabled. I do not recommend this unless you perform frequent backups and have a UPS with battery backup.

PaperTiger (Author) commented:
The block size is 256K. I ran the same test on my current production database server, a much older model of X3650 with a RAID 5 setup, and it gets 160MB/s! My home computer runs at 60MB/s with no RAID.

Keep in mind this machine has RAID 10, which should be at least twice as fast as the RAID5.

RAID 10 is not 2x faster than RAID 5; that is an incorrect generalization. First and foremost, you have two inversely proportional performance metrics: IOPS and throughput. Reads and writes on RAID 5 and even RAID 10 have different performance characteristics depending on the RAID algorithms and RAID level. Even with just RAID 5 and sequential READS, both throughput and IOPS are a function of the number of disks, even if everything else is equal. And even if you keep all aspects of the RAID the same, changing the NTFS allocation size, or doing aligned vs. unaligned I/Os, can change both throughput and IOPS dramatically.

I assume all you care about is throughput ... but are you considering things like NTFS logging? If you write something to a file, NTFS logs it and issues another write for journaling purposes.

Look at perfmon, the I/O section, and see what it reports.  

Another thing: how do you know you don't have a bunch of bad blocks on one of the disks? Don't trust hardware until you verify it. Do some deep disk I/O diagnostics (which cannot be done through a RAID controller). Look at the event logs in the RAID to see if anything is out of the ordinary.

Finally, if you are doing SQL Server mostly, then its native block size is 64KB. You will get the best overall performance in that environment with NTFS = 64KB and the RAID stripe size set so that each disk reads/writes 32KB at a time. If you are in a high-THROUGHPUT SQL environment, which is rather rare, then double the RAID stripe size.
PaperTiger (Author) commented:
How can you explain that my SATA I drive runs faster than this machine?

I did an Oracle database import both on a Dell dual-CPU (single-core) server with a single SATA I drive and on this brand new IBM X3650, and the import on the crappy Dell finished in one third of the time: 2 hours vs. 6! On my production database server, the 4-year-old X3650 mentioned above with 160MB/s sequential writes, the same import is done in 20 minutes!

This is not right somewhere.

You have not established that the SATA drive "runs faster". You have merely established that this particular LOGICAL drive yields X MB/sec THROUGHPUT for a specific benchmark. In fact, you haven't established anything at all about a hard drive. The performance is mostly determined by how you configure the RAID controller.

Let's say your benchmark has been running for a few seconds, so there are no longer any cached I/Os lurking in any of the physical disks' or RAID controller's I/O buffers.

If the write benchmark reports 35MB written in one second, the disks actually wrote at a minimum 70MB. Ignoring everything else, you have RAID 1 mirroring, so every byte of data is written twice: 35MB of logical writes is 70MB of physical writes, so you are really getting 70MB/sec at the disk level.

Now, here is where you need to do a thought experiment. Your NTFS default is 4KB clusters. When you write something, even if you turn off all journaling and do nothing more than create a single-byte file, how much data gets written in total? The block size is 256KB, so the controller has to write a whole block; that is what it is there for. Since you are mirrored, it writes it twice. Now you are up to 512KB. Double that again, because creating a file requires at least something to be written to the directory. And it needs to read the directory as well.

So in this case, even a highly tuned machine with a tiny directory that fits in a single 256KB block writes about 1MB of data in total for a file just a few bytes long. Now think of all the log files, and how Windows updates even file ACCESS time/date stamps when you type out a file. All those 512KB writes are triggered if you so much as touch a file.
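The accounting in that thought experiment can be written out in a few lines of Python (hypothetical numbers matching the 256KB-block, mirrored example above):

```python
# Physical bytes written to create a tiny file, per the thought experiment:
# one full 256KB block for the file data, mirrored, plus the same again for
# the directory update. Illustrative only.
BLOCK_KB = 256
MIRROR_COPIES = 2

data_kb = BLOCK_KB * MIRROR_COPIES       # file data: one block, mirrored -> 512KB
directory_kb = BLOCK_KB * MIRROR_COPIES  # directory update, mirrored -> 512KB
total_kb = data_kb + directory_kb

print(total_kb)  # 1024 -> about 1MB of physical writes for a few-byte file
```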

If you were doing video streaming, then large block sizes make sense .. but never on the boot disk.

Now, with a stand-alone HDD, block size = I/O size. If you ask NTFS for 4KB, you read 4KB; if you write 4KB, you write 4KB, not 64 times that.

See? Block size is VERY important.  Make it smaller!

PaperTiger (Author) commented:
You may want to take a look at the Bart's Stuff Test. It runs for hours to test both read and write.

The results are consistent with the database import and export.
No need to look at Bart's Stuff Test. It is not RAID-aware, and it is not reporting the I/O performed by the physical drives. Re-read #29722524. If you want better performance, set up a pair of RAID 1 arrays: I would go with 64KB on one (put SQL and Exchange there) and whatever the minimum block size is for your controller on the C: drive. If you insist on doing RAID 10, then at least drop the block size to 64KB.

In all cases, use diskpart to align I/Os.  
It is not possible to give specifics on an optimal configuration without a heck of a lot of details, and even then it would be an educated guess. However, if you do what I describe, I have no doubt it should give you at bare minimum 2x better performance, more likely 4x.
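The idea behind aligned I/Os can be sketched numerically: the partition's starting offset should be a multiple of the stripe-unit size, so NTFS clusters never straddle a stripe boundary. This Python fragment (illustrative; the 63-sector start was the classic pre-Vista Windows partition default) just rounds an offset up to the next boundary, which is what you accomplish with diskpart's align option:

```python
# Round a partition start offset (in KB) up to the next stripe boundary,
# so file-system clusters line up with RAID stripe units. Illustrative only.
def aligned_offset_kb(desired_kb, stripe_kb):
    return ((desired_kb + stripe_kb - 1) // stripe_kb) * stripe_kb

legacy_kb = 63 * 512 // 1024   # classic 63-sector (31.5KB) partition start
print(aligned_offset_kb(legacy_kb, 64))   # 64 -> bumped to a 64KB stripe boundary
print(aligned_offset_kb(128, 64))         # 128 -> already aligned, unchanged
```

In diskpart on Server 2008, `create partition primary align=64` (the value is in KB) creates a partition aligned this way.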

PaperTiger (Author) commented:
I appreciate the information, but I think you may be going in the wrong direction. Playing with block size, from what I've read, may only give you a 10% increase.

We are not talking about 10%. We are talking about a MUCH better computer that is 4 times slower than its 4-year-old predecessor with a slower RAID configuration (RAID 5), AND slower than any other computer I have, over 20 of them, all configured with a simple default install.

There's something fundamentally wrong!
PaperTiger (Author) commented:
dlethe, I take my comment back because I was wrong.

Your WRITE BACK trick made a HUGE difference. The sequential write jumped from 33MB/s to 250MB/s. Not as great as I expected, but now the Oracle import can finish within 1 hour instead of 5.

I would like to further understand how to fine-tune the stripe size. I did a bit of research, and a 1024K RAID stripe size is suggested for Oracle databases. Any suggestion on how to do it? Can I just set that in the RAID card and then restore my server image?
Hi Paper -
Nailing the stripe size for a particular file system / RAID topology / RAID controller, along with all the file-system settings, disk mode pages, and the other knobs that can each still be worth +/- maybe 50%, is just not possible without a great deal of work, and even then it is all relative to the workload. There is just no substitute for running something like the import, timing it, tweaking, and retrying.

I do not have nearly enough information, but I would try two things, depending on the ratio of large- to small-block I/O, the read/write ratios, and where your system spends its time.

1) Go with RAID 10, 64KB stripe size, make NTFS match, and use aligned I/Os.
2) Go with 2 x RAID 1. This is for a system with a more balanced mix of IOPS needs and occasional high throughput. I would make C: a RAID 1 with 16KB everywhere and put your most random, smaller I/Os there: scratch table space, anything created on the fly. Then make D: a high-throughput logical drive, using 64KB - 256KB.

In both cases, understand that these are gross generalizations.  Really gross.  Even if you nail stripe sizes, you then need to balance I/Os and move things around so the logical disk that is tuned for higher IOPS and lower throughput actually services data files which match the optimal config, and vice-versa.

Remember, no free lunch. With large I/Os like the 1024K you mentioned, the configuration effectively kills your computer whenever it has to write a lousy 2KB of data: you force it to write 1024KB. But if you want to write 10MB, you can do it much more quickly writing 1MB at a time than 2KB at a time.

Sorry about the round-about answer; there are just too many unknowns to give you a much better one.

P.S. Because enabling the write cache made such a huge difference, this indicates that the host O/S is sending very small I/O requests to the RAID controller. WCE lets the controller aggregate and rearrange them into fewer, larger I/O requests.

By the time an I/O request reaches the controller, the application, file system, drivers, or some combination is sending a lot of small I/O requests, and each request carries overhead beyond what it takes to service it.

As such, you will get further benefit from redoing NTFS with a larger allocation size, as described above. It is not possible to be more precise than this: something in the 16-64KB range should be an improvement.
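The effect of coalescing lots of small requests into fewer large ones can be illustrated with a toy cost model (all parameters invented: a fixed per-request overhead plus a bandwidth term):

```python
# Toy model: moving the same data in fewer, larger requests wins because
# each request carries fixed overhead. Parameters are invented for illustration.
def transfer_time_ms(total_mb, request_kb, overhead_ms=0.2, mb_per_ms=0.5):
    requests = (total_mb * 1024) / request_kb
    return requests * overhead_ms + total_mb / mb_per_ms

small = transfer_time_ms(100, 4)     # 100MB as 25,600 x 4KB requests
large = transfer_time_ms(100, 256)   # 100MB as 400 x 256KB requests
print(small, large)  # the small-request case is ~19x slower in this model
```

This is only a caricature of what write-back caching does, but it shows why the same 100MB of data can take wildly different amounts of time depending on the request size reaching the controller.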
PaperTiger (Author) commented:
Thank you for the information. I am hesitant to try now, as this has become our prod server. I'll test these ideas when I get another one in. Thank you again!
There is no memory cache on the M1015 card, so only write-through is available. Also, I get up to 190MB/s read and 85MB/s write using RAID 10 and 7200RPM SATA2 drives.