charvett (United States of America)

asked on

Dell PERC 6E w/ MD1120 - RAID 10 - How many physical disks per span?

Hello all,

I'm trying to figure out the best configuration for performance to create a single RAID 10 virtual disk across 24 physical disks.

It seems that I can span as few as 4 and as many as 12 physical disks per span. With 4 per span I would have 6 spans total, with 12 I would have 2 spans. But no where in the documentation have I seen notes related to performance or any related trade-offs.

Does anyone have any thoughts or point me to a resource that discusses best practices for determining the number of physical disks per span?

Thanks! (screenshot of the span configuration attached)
David (United States of America)

The "BEST" configuration is the one that generates the least amount of I/O.   a 4 disk RAID10 has profoundly different performance characteristics of a 12-drive RAID10, which has different characteristics of 3 x 4-drive RAID10s.

For any host-generated I/O of X blocks, you will get a different number of I/Os per second, as well as different throughput in the controller, depending on the stripe size in the controller, the number of disks, and the type of disk drive. SAS and SATA have different characteristics as well (plus the number of threads, and whether the I/O is a read or a write, random or sequential).

So the "BEST" answer is to measure how you are currently doing storage.  In grand scheme of things a VM pretty much does random I/O of 64KB sized blocks.   So I would benchmark several configurations based on that.

My general answer is that 3 x 4-disk RAID10s will provide much better performance overall than a single 12-disk set.
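If it helps to see the spindle math written out, here is a rough sketch with assumed numbers (15K-SAS ballpark figures, not measurements from this hardware); it only models spindle count and read/write mix, not stripe size, caching, or queueing, which is exactly why benchmarking still matters:

# Back-of-envelope random-I/O capability of a RAID10 set (assumed figures).
PER_DISK_IOPS = 180  # assumed random IOPS for one 15K SAS spindle

def raid10_iops(disks, read_fraction=0.7):
    # Reads can be served by either mirror; writes must hit both copies,
    # so write capability scales with half the spindle count.
    read_capacity = disks * PER_DISK_IOPS
    write_capacity = (disks // 2) * PER_DISK_IOPS
    # Harmonic blend for a mixed workload with the given read fraction.
    return 1 / (read_fraction / read_capacity + (1 - read_fraction) / write_capacity)

for disks in (4, 8, 12, 24):
    print(f"{disks:>2}-disk RAID10: ~{raid10_iops(disks):,.0f} mixed 64KB IOPS")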
ASKER CERTIFIED SOLUTION
Gerald Connolly (Australia)
(solution text available to Experts Exchange members only)

SOLUTION
(solution text available to Experts Exchange members only)
charvett (ASKER)

Hi guys,

Thanks for the responses!

Just to clarify - my goal is to create a single RAID 10 virtual disk from all 24 physical disks. My database file is just over a TB in size.

As Connolly pointed out, and as was my understanding until I started working with Dell PERC controllers, a RAID10 configuration is striped sets of mirrors, so you would expect 12 spans (mirrored sets) of 2 drives each.

However, the PERC controller creates RAID 1 spans. Think of them as chains of mirrored drives, which is kind of confusing; you can see it in the image I attached. These spans can be as few as 4 physical drives or as many as 12 physical drives, and then you stripe across the spans.

Anyway, I think I found my answer in the Dell documentation. It states that when set to 'Intelligent Mirroring', the controller will try to produce spans with the fewest drives for the best I/O.

So in my case it would be 4 drives per span with a total of 6 spans = 24 physical drives.
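For reference, the span arithmetic for 24 disks works out like this (a quick sketch of the option space, nothing PERC-specific):

# Enumerate RAID10 span layouts for 24 physical disks.
# Each span must have an even number of drives (they are mirrored
# internally) and, per the limits described above, 4-12 drives.
TOTAL_DISKS = 24

for span_size in range(4, 13, 2):
    if TOTAL_DISKS % span_size == 0:
        print(f"{span_size} disks/span -> {TOTAL_DISKS // span_size} spans")
# Prints: 4 -> 6 spans, 6 -> 4 spans, 8 -> 3 spans, 12 -> 2 spans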

Thanks again for your replies.
MrVault
did you figure this out? we had a raid 1 set that doubled its performance when we remade it as a 4 disk raid 10 set. we wanted further performance so we added 2 more drives to make a 6 drive raid 10 set, with a 2 disk span setting. the performance did not change at all, which I thought was very odd. I don't really get this spanning thing.
"performance" is relative.  First, the two main measurements are throughput and I/Os per second.  For any given I/O size, then their relationship is inversely proportional.  Both the RAID controller and file system block sizes, along with the controller chunk sizes; and the nature of the I/O (random vs sequential vs read vs write) have different performance characteristics.

You just can't add more disks to ANY given raid set and expect overall performance to improve for all I/O loads.  In general, a large multi-drive RAID10 is a poor architecture once you get beyond 4 disks total.  

Also, one can squeeze additional "performance" out of a 2-disk RAID1 by repartitioning it into 2 LUNs and then striping them with the O/S, i.e., making a 2-disk "RAID10". (But one can't do this with just any hardware config.)
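To put numbers on the IOPS-vs-throughput trade-off, here is a toy per-disk service-time model with assumed seek and transfer figures (illustrative only):

# Toy model: per-disk IOPS and throughput as a function of I/O size.
# Service time = seek + rotational latency + transfer time.
SEEK_MS = 3.5          # assumed average seek for a 15K drive
ROTATE_MS = 2.0        # half a rotation at 15,000 rpm
TRANSFER_MB_S = 120.0  # assumed sustained media rate

for size_kb in (4, 64, 1024):
    transfer_ms = (size_kb / 1024) / TRANSFER_MB_S * 1000
    service_ms = SEEK_MS + ROTATE_MS + transfer_ms
    iops = 1000 / service_ms
    throughput = iops * size_kb / 1024
    print(f"{size_kb:>5} KB I/O: ~{iops:5.0f} IOPS, ~{throughput:5.1f} MB/s")

Small I/Os maximize IOPS, large I/Os maximize MB/s; the same spindle cannot give you both at the same time.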

the activity on the volume stayed the same and our measurements were over the same period (we track this every day over a given time frame). both setups also have the same stripe size of 64K and allocation/cluster size of 64K.

can you explain why 2 raid 10 4-disk sets would perform better than 1 raid 10 8-disk set assuming the same files were on each?

I also still don't get the whole disk spanning question when you're creating a virtual disk on the perc controller. for a 6 disk raid 10 set (assuming you needed the space), would you choose 2 disk spans (so there'd be 3 spans total) or would you choose 3 disk span (making 2 spans total)?
MrVault,

It was at first a bit confusing to me which disk span configuration to choose. Rather than blither on pointlessly about IO or throughput, I will tell you that I've found that fewer disks within a span works better.

So in your example, I would configure it to two disks per span for a total of three spans.

I know what it's like to just want a straightforward answer, and in this case I've found it through trial and error.

Good Luck!

"can you explain why 2 raid 10 4-disk sets would perform better than 1 raid 10 8-disk set assuming the same files were on each?"

Well first, I am taking the position that this is server-type transactional/database traffic instead of video streaming.

In server-type loads IOPs are critical. Your question indicates you are under the assumption that spreading a file across 8 disks is better than four. It isn't. The correct thing to do would be to distribute the files so that half of them (assuming balanced I/O) are on one 4-disk set and the other half are serviced by the other disks.

With your way of doing things, any large file is going to require all disks. This creates a big bottleneck. Get a bad block, and EVERYBODY waits. Every I/O on the computer is effectively its own bottleneck. Every I/O has to wait for the previous I/O to complete once there is even a single I/O already in the queue, which will absolutely be the norm.

So let's say that, real-world, your disks can do 40 I/Os per second. If all of the files are split up across 8 disks, then effectively you can only handle 40 files at any one time. The data will be spread out so much that you won't get any cache hits either, so you'll end up doing random I/O.

Now, if you split ....

You have 2 of these 40-I/Os-per-second queues. That is double the number of files. Not only that, but since files are split less, you have a greater possibility of getting some sequential I/O which will be cached. You are more than doubling the amount of work the system can do.
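Written out as a sketch, using the assumed 40 IOPS per spindle from the example above:

PER_DISK_IOPS = 40  # assumed random IOPS per spindle

# A host I/O striped across every disk in a set occupies all of its spindles,
# so each set completes such operations at roughly single-spindle pace.
layouts = {"1 x 8-disk RAID10": 1, "2 x 4-disk RAID10": 2}
for name, independent_sets in layouts.items():
    print(f"{name}: ~{independent_sets * PER_DISK_IOPS} stripe-wide ops/sec")
# 40 vs. 80: the split layout services twice as many concurrent files,
# provided the files (and their load) can be balanced across the two sets.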

A BETTER alternative would also be to throw in a RAID1. Use that for write-intensive things such as journaling and building scratch data tables. Set the block size to the absolute minimum, and you could very well have that 2-disk RAID1 outperform a 4-disk RAID10 in some types of I/O.

Here is a trick that will really blow your mind, but you can't do it on the PERC: if you wanted more throughput, you could partition a 2-disk RAID1 into a 2-disk "RAID10". Depending on some variables, you could end up seeing a nice 25% throughput improvement :)
we're talking about index files in the hundreds of gigs, not tons of small files.

are you suggesting that I create span sizes that are half the number of disks in the raid set? so for a 4 disk set, use span size 2 and 3 for a 6 disk set?

To use an index file efficiently, you will be doing random I/O. Assuming SQL Server, the native I/O size is 64KB, so you must optimize the RAID config so that whenever you WRITE data, no more and no less than 64KB worth of data is written to each physical disk in the mirrored pair.

I have no idea how your RAID is set up at this level; you'll have to research it. I've seen people screw this up so much that every I/O results in well over 1MB of data being copied instead of just 64KB, so they get about 1/7th of the write performance they should be getting. If you are running Win2K3 instead of Win2K8 (excuse me if I assume Windows instead of a UNIX variant, just playing the odds), then your partition must be aligned on a 64KB boundary (or whatever other block size you have set on the RAID), or you will end up doing twice what you need to do.
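A quick sketch of the alignment effect, assuming a 64KB stripe element and the old 63-sector (31.5KB) partition offset that Win2K3 used by default:

# How many 64KB stripe elements does each 64KB I/O touch?
STRIPE_KB = 64

def elements_touched(partition_offset_kb, io_size_kb=64, samples=1000):
    touched = 0
    for i in range(samples):  # consecutive 64KB I/Os
        start = partition_offset_kb + i * io_size_kb
        end = start + io_size_kb - 1
        touched += (end // STRIPE_KB) - (start // STRIPE_KB) + 1
    return touched / samples

print("aligned (offset 1024KB)   :", elements_touched(1024))   # 1.0 element per I/O
print("misaligned (offset 31.5KB):", elements_touched(31.5))   # 2.0 elements per I/O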

Large index files are still very much random I/O, so you are always going to be better off using a RAID setup which will provide high IOPs with a probability of getting some cached I/O. This means putting individual files on RAID1 until you get to the point where you see I/Os having to wait in a queue. Then best practice is to break up the index file into 2 separate files and give it another RAID1.

The more traditional RAID10 is not as good a match because the host computer won't be issuing 128KB I/O write or read requests, so there will be no throughput advantage that you might see with other types of data.

we're using windows server 2008 r2 x64 enterprise. the offset issue is not there. we were at raid1 and performance doubled when we went to raid10 4 disks. when we went to 6 disk raid 10 (2 disk span) performance stayed the same. the stripe size is 64k and the allocation/cluster size in windows is 64k.

there are 3 indexes around 100GB each. so breaking each up into smaller files and giving them their own raid1 sets isn't really possible, though each index is its own file already.

maybe doing 3 raid1 sets is better?
You can run perfmon to tell you specifics, but if queue depth is always 1 or 0, then there is no reason to change for IOPs reasons. If, however, queue depth is consistently higher, then you should absolutely break up those stripes, provided you can distribute the index files so that they can be balanced between the 2 RAID1s.
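If you would rather capture that from a script than watch perfmon, something along these lines should work (a sketch assuming the built-in Windows typeperf tool and the standard PhysicalDisk counters; adjust the instance name for your volume):

# Sample the average disk queue length via typeperf and report the mean.
# Assumes Windows with typeperf.exe on the PATH; counter/instance names
# may need adjusting for your system (_Total vs. a specific disk).
import csv, io, subprocess

COUNTER = r"\PhysicalDisk(_Total)\Avg. Disk Queue Length"
out = subprocess.run(
    ["typeperf", COUNTER, "-si", "5", "-sc", "12"],  # 12 samples, 5s apart
    capture_output=True, text=True, check=True,
).stdout

values = []
for row in csv.reader(io.StringIO(out)):
    if len(row) > 1:
        try:
            values.append(float(row[1]))
        except ValueError:
            pass  # header or footer line, not a sample
if values:
    print(f"average queue length over {len(values)} samples: "
          f"{sum(values) / len(values):.2f}")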

As for the 6 disk span not helping,  it generally will only hurt.   Most I/O is going to be issued in powers of 2.  So if you think about it, how often will you have an I/O request that is a multiple of 6?  Pretty much never, so a 6-disk RAID10 can't possibly ever help much of anything except large-block sequential.

(You'll see this problem to an even greater extent with RAID5. Never build a 4- or 8-drive RAID5, as this guarantees you leave performance on the table; do a 5- or 9-disk set instead. The effects will become more noticeable as you add concurrent I/O requests, and with non-premium, under-cached controllers.)
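To illustrate the multiple-of-2 point with a 64KB stripe element (assumed, purely illustrative numbers):

# Distribution of a 256KB sequential request across the mirrored spans.
STRIPE_KB = 64

def chunks_per_span(request_kb, data_spans):
    chunks = request_kb // STRIPE_KB
    per_span = [0] * data_spans
    for c in range(chunks):
        per_span[c % data_spans] += 1
    return per_span

print("6-disk RAID10 (3 spans):", chunks_per_span(256, 3))  # [2, 1, 1] - uneven
print("8-disk RAID10 (4 spans):", chunks_per_span(256, 4))  # [1, 1, 1, 1] - even
# With 3 spans, one mirrored pair does two chunks while the others sit idle;
# with a power-of-2 span count the request divides evenly.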
by queue depth do you mean disk queue length?

so are you suggesting we won't see a performance increase until we go to 8 disks instead of 6?

we're using Average Disk Seconds/read(write, transfer) to gauge performance.
Yes, queue depth.
As for what you will see, there are way too many variables to guarantee anything, but in general you need to understand that large RAID10 stripe sets are not appropriate for a database if you are seeing I/Os queue up and the reason is NOT bus saturation. Combine that with the multiple-of-2 rule and you probably won't see that much of a difference. If you're going to redo it, and you can get away with it, go with either 2 x RAID10 or 1 x RAID10 + 2 x RAID1.

The 1 x RAID10 + 2 x RAID1 is best for databases that are getting pounded with queries and reindexing, like a decision-support or expert system where you are constantly adding sophisticated queries and have huge scratch tables. The 2 x RAID10 is a better overall solution for more general I/O.
we only have 6 disks to work with, so we have to either do raid1 + raid10, 3xraid1, or 1 raid10
Then most likely the 1 x RAID10 + 1 x RAID1 will be better for you.  
that's the setup we had but it wasn't getting enough performance. would 3 x raid1 not be better, giving each high-IO index its own raid 1 set rather than putting all three indexes on 1 raid10 set?
You need to step back and look at the specifics of the RAID config as set up in the controller. It all starts with that, and you haven't identified any bottleneck. How is the controller set up? Block size, and write cache (at the RAID controller, on the individual disk drives, and within the Windows O/S itself). There are 3 different write caches.

With 8 disks, the bottleneck could very well be the card itself and the slot in the mobo. You have enough disks to saturate the bus unless you have enough lanes in the PCIe slot. The fix could be nothing more than plugging the same card into a PCIe x8 or higher slot.
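Some rough saturation arithmetic behind that, with assumed per-disk and per-lane rates rather than verified specs for this exact card and slot:

# Could 8 fast SAS drives saturate the controller's PCIe slot?
DISKS = 8
MB_S_PER_DISK = 150  # assumed sustained sequential rate per 15K SAS drive
PCIE_LANE_MB_S = {"PCIe 1.x": 250, "PCIe 2.0": 500}  # usable per-lane ballpark

aggregate = DISKS * MB_S_PER_DISK
print(f"disks can stream roughly {aggregate} MB/s combined")
for gen, per_lane in PCIE_LANE_MB_S.items():
    for lanes in (4, 8):
        verdict = "OK" if lanes * per_lane >= aggregate else "possible bottleneck"
        print(f"{gen} x{lanes}: ~{lanes * per_lane} MB/s -> {verdict}")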
64k block size. no read ahead. write back. 1gb non volatile cache. no disk cache. nothing in windows. 64k cluster/allocation size

it's an R510 with 12 x 3.5" 600 GB 6Gbps 15K SAS disks.
Turn on read-ahead. I don't have the architecture docs on the R510, so you will have to do some homework. You need to find out how that particular controller handles block sizes at the hardware level.

If the block size is 64KB and you have a 6-disk RAID10, does that mean that a write will ALWAYS result in (64KB x 3 disks) x 2 mirrors being written? On reads, some controllers will make 3 disks read 64KB each, even though you want just 64KB. Others, like the fakeRAID controllers, will actually make you read 64KB from all 6 disks. Others will read just 64KB from the mirror that has that particular block.

Still others, the best ones, will make only one disk read the 64KB; the I/O goes to the disk that can satisfy it the quickest, and the controller load balances between mirrors.

For writes it is the same issue: some controllers will write 192KB worth of data x 2 mirrors no matter what, but best case they will write 64KB x 2 disks.
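Putting numbers on those cases for a 64KB logical write against a 6-disk RAID10 (3 mirrored spans, 64KB stripe element):

# Bytes physically written for one 64KB logical write, per controller behaviour.
ELEMENT_KB = 64
DATA_SPANS = 3  # 6-disk RAID10 = 3 mirrored pairs

full_stripe = ELEMENT_KB * DATA_SPANS * 2  # worst case: whole stripe, both mirrors
targeted = ELEMENT_KB * 1 * 2              # best case: one element, both mirrors
print(f"full-stripe behaviour: {full_stripe} KB written")  # 384 KB
print(f"targeted behaviour   : {targeted} KB written")     # 128 KB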

Now you understand the importance and ramifications at a better level. There is much more here that I don't care to get into, like how the I/O queue, reordering, and cache buffers play into this, along with the RAID5 write hole, but research this and it will tell you why your implementation is most likely incorrect.

we saw degraded performance when enabling read-ahead actually.