Solved

What RAID config should i use for DBs

Posted on 2007-11-16
38
853 Views
Last Modified: 2011-10-19
Hi,

I'm just about to build out a new SQL server that will host a number of databases with high I/O demands.  Normally I would build this out as follows:

OS: RAID 1 using onboard discs and controller.
SQL Server + SQL Logs: RAID 1 using a second pair of onboard discs and same controller as OS.
SQL Server DBs: RAID 5 across 13 discs of an external storage array with dedicated controller.

I have the option of creating a RAID 10 set using 12 of the 13 discs of the external array rather than RAID 5.  RAID 10 generally gives better I/O performance over RAID 5 but will the two sets of 6 discs mirrored in RAID 10 give better I/O than 13 discs in RAID 5?

For my calculations I'm also using the (n-1)*O estimate, where n = number of discs and O = maximum I/O throughput per disc.
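To make that concrete, here is the back-of-envelope sum I have in mind (a rough sketch only; the per-disc figure is a placeholder until I pull the manufacturer's specs, and the two RAID 10 numbers are just the optimistic and pessimistic bounds I'm unsure about):

# Rough sketch of the estimates in question. O is a placeholder for the
# manufacturer's quoted maximum I/Os per second per disc.
O = 180  # placeholder only - substitute the real per-disc figure

# 13-disc RAID 5, using the (n - 1) * O rule of thumb above.
raid5_read = (13 - 1) * O

# 12-disc RAID 10 - the open question: do reads effectively come from all
# 12 spindles (both halves of every mirror), or only from 6?
raid10_read_optimistic = 12 * O   # every spindle services reads
raid10_read_pessimistic = 6 * O   # only one side of each mirror does

print(raid5_read, raid10_read_optimistic, raid10_read_pessimistic)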

I'm still leaning towards the RAID 5, but that's because I don't have a great deal of experience with RAID 10 arrays and their I/O throughput.

Thanks in advance.

Matt

 
0
Comment
Question by:TheNumberUK
38 Comments
 
LVL 31

Expert Comment

by:James Murrell
ID: 20298796
0
 
LVL 6

Expert Comment

by:nathana21
ID: 20298802
I would stick to RAID 5, but look at this >> http://www.experts-exchange.com/Storage/Q_20640972.html

0
 
LVL 55

Expert Comment

by:andyalder
ID: 20298820
>RAID 10 generally gives better I/O performance over RAID 5 but will the two sets of 6 discs mirrored in RAID 10 give better I/O than 13 discs in RAID 5?

Yes, especially with write since RAID 5 needs to do twice as many physical I/Os for a logical write than RAID 10 needs. Also with a disk failed RAID 10 knocks the spots off RAID 5 since it just reads the other disk in the mirror rather than reading all the disks in the array to calculate data from parity. Also rebuild after a single disk failure in RAID 10 takes minutes whereas with RAID 5 it takes hours. Plus you can use the 13th disk as a hot spare.
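To put some very rough numbers on the write side (a sketch only - the 180 I/Os per second per spindle is an assumed figure, not a measurement, and a controller with write cache will do better than this):

# Back-of-envelope random-write model. Per-spindle IOPS is assumed.
disk_iops = 180

def random_write_iops(n_disks, write_penalty, per_disk=disk_iops):
    # Each logical write costs `write_penalty` physical I/Os spread
    # across the array, so divide the aggregate capability by it.
    return n_disks * per_disk / write_penalty

# RAID 5: read old data + read old parity + write data + write parity = 4
# RAID 10: write the block to both halves of the mirror = 2
print("RAID 5, 13 discs:", random_write_iops(13, 4))   # 585 writes/sec
print("RAID 10, 12 discs:", random_write_iops(12, 2))  # 1080 writes/sec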
0
 
LVL 42

Expert Comment

by:EugeneZ
ID: 20298931
50/50 read-write?

check:
 RAID Levels and SQL Server
http://msdn2.microsoft.com/en-us/library/ms190764.aspx
0
 

Author Comment

by:TheNumberUK
ID: 20298991
Thanks guys.

I'm just going to pull up the manufacturer's max I/O stats for the discs and do the math to see what the average I/O figures are before I decide which way to go.  As it stands at present I tend to agree with andyalder that the RAID 10 is going to be quicker for the writes.  But to throw in a curve ball, all our data writes are done out of hours and it's the data reads during the working day that are our key driver.  It will be interesting to see the outcome of the math.  When I have the figures I'll post them on here and see if you reach the same result and conclusion as me.

I need to be able to mathematically prove which is going to be best before I start.

Cheers for the quick response.

Matt
0
 
LVL 21

Expert Comment

by:mastoo
ID: 20299144
It's moot if you don't worry about write performance during normal hours, but andyalder made an important point.  You can't just compare sustained throughput rates (or similar) since RAID 5 does more I/O for a given write operation.
0
 

Author Comment

by:TheNumberUK
ID: 20299220
So what are the read benefits with RAID 10 over RAID 5, or are they similar?

0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 20299255
RAID 5 should give you better read performance.
0
 

Author Comment

by:TheNumberUK
ID: 20299370
What if I was to change the read\write ratio in favour of read I/Os on the external array controller and use RAID 10?
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20299408
You don't do N-1 for RAID 5 read but N same as for RAID 10 so your RAID 5 will be 13/12 times as fast as the RAID 10 for purely read as long as all the disks work.

As soon as one fails you are doing a single I/O from 1 of 11 disks for a single read in RAID 10, but 12 I/Os from your 13-disk RAID 5 array if the data is supposed to be on the failed disk. I've seen systems that are so unresponsive during a RAID 5 rebuild that the users have been told not to use the system until the rebuild is complete.
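As a rough illustration of the degraded-mode difference (a sketch assuming uniformly random reads; real behaviour depends on the controller):

# Expected physical reads needed to satisfy one logical read with one
# disk failed, assuming the requested block is equally likely to live
# on any disk in the array.

def raid5_degraded(n_disks):
    p_on_failed = 1 / n_disks
    # On the failed disk: reconstruct from the other n-1 disks.
    # Anywhere else: a normal single read.
    return p_on_failed * (n_disks - 1) + (1 - p_on_failed) * 1

def raid10_degraded():
    return 1.0   # the surviving half of the mirror always answers with one read

print("RAID 5, 13 discs, degraded:", round(raid5_degraded(13), 2))  # 1.85
print("RAID 10, degraded:", raid10_degraded())                      # 1.0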
0
 

Author Comment

by:TheNumberUK
ID: 20299456
I agree with your thoughts entirely, andyalder.  The business will have to decide what they deem more important, I/O throughput or resilience.  I'm just going to present them the stats.  As we currently have online failover systems for the database, we are in a position where we can run with one server and array completely out of action whilst it's rebuilding a broken array.  My guess is they are going to opt for performance.  
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20303870
Why 13 disks? Most 19" enclosures take 14 disks. Comparing Raid 10 Vs Raid 5 100% read with equal numbers of disks is a better comparison, they should be exactly the same speed as long as stripe size is also the same. RAID 10 may have a slight edge as the controller has a choice of which disk in a mirror to get the data from.
0
 

Author Comment

by:TheNumberUK
ID: 20311246
We recently had a SAN installed and the IBM consultant that configured it kept going on about how, if you use RAID 5, then you should always use an odd number of discs so that once the parity is taken out there is an even number of remaining discs to service the requests.  Apparently this aids performance; I have no reason not to believe him, and it's not something I've come across before.

Does your comment 'You don't do N-1 for RAID 5 read but N same as for RAID 10 so your RAID 5 will be 13/12 times as fast as the RAID 10 for purely read as long as all the disks work.' not go against what you have just said above?

Would the RAID 5 stripe be able to handle more requests as the data is over 12 discs rather than 6 mirrored pairs?  The actual speed of processing the requests should be the same using your example above, but would the RAID 10 array start to queue requests sooner once all 6 mirrors are handling requests?

 
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20311306
I have never heard of having an odd number of disks in RAID 5 being superior to having an even number of disks; I would ask the IBM bloke for something to back that up.

13/12 is simply because you had 1 more disk in your RAID 5 array. I can't see how it contradicts.

A decent RAID controller will do balanced reads from RAID 10 under high I/O demand; half the I/Os will be directed at one disk in a mirror and half to the other disk. Under low I/O demand it may send the read request to both disks and take the result from whichever comes back faster.
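Something along these lines, as a toy illustration only - exactly how the requests are shared out is down to the individual controller's firmware:

# Toy model of read dispatch across one mirrored pair.
from collections import deque

queues = [deque(), deque()]          # one outstanding-request queue per disk

def dispatch_read(block):
    # Under load: send the read to whichever disk currently has the
    # shorter queue, so roughly half the I/Os land on each spindle.
    target = min(queues, key=len)
    target.append(block)

for block in [100, 2048, 7, 512, 9000, 64]:
    dispatch_read(block)

print([len(q) for q in queues])      # -> [3, 3]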

What controller are you using?

0
 

Author Comment

by:TheNumberUK
ID: 20312230
OK, that's what I wasn't sure about: whether the controller would balance the reads across the mirror.  I'm going to use RAID 10 and set the read\write ratio in favour of read. I will then benchmark it against an identical RAID 5 array.  I believe that the RAID 10 will give us the better performance and also the obvious improved resilience.

As soon as the testing is done I'll publish the results on here to see which performed best and award the points accordingly.  Thanks for all your suggestions\answers, it has been very helpful.
0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 125 total points
ID: 20312680
>> Comparing Raid 10 Vs Raid 5 100% read with equal numbers of disks is a better comparison <<

But that ignores the reality of RAID 10 vs RAID 5.  RAID 10 will always end up with (nearly) have the RAID 5 drives based on a given physical number of drives.  Those extra spindles *speed up reads*.  So, for a mostly read-only environment, RAID 5 *will* be faster, whether the RAID 10 controller auto-load-balances or not.
0
 
LVL 55

Assisted Solution

by:andyalder
andyalder earned 125 total points
ID: 20312795
>RAID 10 will always end up with (nearly) have the RAID 5 drives based on a given physical number of drives.

Could you repeat that in English please Scotty? I'm not being rude, but it doesn't make sense grammatically and I'm not sure what you're trying to say.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 20312842
Sorry, I re-wrote the sentence and didn't properly check it afterward.

Based on the same number of physical starting drives, RAID5 will always end up with nearly double the number of spindles responding to a single read request.  By definition, RAID10 only uses half the spindles to do a single read, whereas R5 uses them all.  Therefore, R5 read performance should always be better than R10.  

R5, though, is of course much worse for updates, roughly double the i/o per update of R10.
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20313191
I see what you mean, but RAID controllers do not work that way unless they are badly set up.

If you set the database block size to 64K and the stripe size* to 64K and the NTFS allocation unit size to 64K then a single disk can satisfy a single I/O as long as you align the NTFS partition with diskpar(t).

* I'm using "stripe size" to refer to the amount of data read from a single disk before moving on to the next one in the array. Others may use this term to refer to the total width of the array which is better referred to as stripe width.
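A quick sketch of why the alignment matters (illustrative only; it just counts how many stripe units, and therefore disks, a single 64K read has to touch, using the old 63-sector default offset as the unaligned case):

# How many stripe units (and therefore disks) does one 64K read touch?
STRIPE = 64 * 1024      # data read from one disk before moving to the next
IO_SIZE = 64 * 1024     # database I/O size

def stripe_units_touched(file_offset, partition_start):
    start = (partition_start + file_offset) // STRIPE
    end = (partition_start + file_offset + IO_SIZE - 1) // STRIPE
    return end - start + 1

aligned = 64 * 1024     # partition aligned to 64K with diskpar(t)
unaligned = 63 * 512    # the old default 63-sector offset (31.5K)

print("aligned:", stripe_units_touched(0, aligned))      # 1 disk per read
print("unaligned:", stripe_units_touched(0, unaligned))  # 2 disks per read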
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 20313251
>> If you set the database block size to 64K <<

I didn't realize you had a choice -- I thought SQL always used 64K.


>> the stripe size* to 64K and the NTFS allocation unit size to 64K then a single disk can satisfy a single I/O as long as you align the NTFS partition with diskpar(t). <<

But you will get faster reads from allowing multiple spindles to handle one read.  Also, I think now chained I/O can be used to get more than one 64K block in a single request.
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20313406
>I didn't realize you had a choice -- I thought SQL always used 64K.

The rule applies to all databases - in some of them you can change the I/O block size.

>But you will get faster reads from allowing multiple spindles to handle one read.

A single read may be faster with a small stripe size but we are not doing single reads; the OS or database is likely to throw hundreds of read requests at the controller and wait for the result.

Take a look at figure 8 at www.dell.com/content/topics/global.aspx/power/en/ps3q01_hou (search for the 3rd occurrence of loadsim). Note it is for Exchange, which uses 32K database blocks, so a 32K stripe size is fractionally faster than 64K for that particular application.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 20313699
>> The rule applies to all databases <<

All due respect, other dbmses are irrelevant to this discussion: the poster's q dealt with SQL Server in a SQL Server forum.


I always recommend a 64K stripe size for SQL Server, so I have no problem with that.  

>> the OS or database is likely to throw hundreds of read requests at the controller and wait for the result. <<
Quite, which is why more spindles handling them yields better overall performance.
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20313780
But there aren't more spindles in your 14 disk RAID 5 array than my 14 disk RAID 10 array.

Is it the bit about the RAID controller sending half the read requests to one disk and the other half to the other disk the part that you do not understand?
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 20313859
But only 1/2 the spindles address any given read request on R10, whereas all spindles do on R5.  And the controller does not need to spend time & resources doing load-balancing.
0
 
LVL 55

Expert Comment

by:andyalder
ID: 20314279
Only one of the spindles addresses any read request with RAID 5; there are two available to satisfy that request with RAID 10. I am of course assuming fairly random data; with large BLOBs RAID 5 may have the edge.

The controller overhead is pretty similar with RAID 5 as it is with RAID 10 under read conditions; with RAID 5 it has to work out which disk the data is on and put those requests into a disk queue, and with RAID 10 it is the same but it has two queues to choose from. It's down to the individual manufacturer's firmware as to whether it sorts the queue so that the high blocks go to one disk and the low ones go to the other, leaving one disk's head on the inside of the platter and the other's on the outside, thus reducing seek time.
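As a toy sketch of that block-range split (not any particular vendor's implementation):

# Sort the pending reads by block address and give the low half to one
# disk in the mirror and the high half to the other, so each head stays
# in its own region of the platter.
pending = [8200, 15, 9000, 120, 5000, 8700, 40, 6100]
pending.sort()
mid = len(pending) // 2
low_disk, high_disk = pending[:mid], pending[mid:]
print(low_disk)    # [15, 40, 120, 5000]
print(high_disk)   # [6100, 8200, 8700, 9000]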

I suggest we wait for TheNumberUK to post their benchmark results, there is no hard and fast rule for this since it is down to the individual controller.
0
 

Author Comment

by:TheNumberUK
ID: 22425082
It has been some time since I posted this and we have conducted extensive testing of both configurations.  Although we came out with a clear answer as to what was going to work best for us, it also showed us that the configuration isn't just a straightforward 'what is the best RAID to have' question.  The factors going into how the data is going to be populated, used and maintained all have an impact on what the appropriate choice is.  Both RAID configurations on our new platform exceeded the I/O performance of the old platform, but ultimately RAID 10 gave us the best overall usage of the hardware in the production\maintenance time windows we have.  I have attached two JPEGs summarising the results we produced using the MS SQLIO utility, which is ultimately what we based our decision on.

Both tests were conducted on the same hardware using MS SQLIO: random I/O types utilising 28 threads with differing I/O request sizes against a DB file of 128GB.  
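For anyone wanting to repeat the exercise, the runs were of roughly this shape. I'm sketching it below with the commonly documented SQLIO switches, and the test file path, duration, outstanding I/O count and particular request sizes shown are illustrative rather than our exact command lines:

# Sketch of scripting a read pass and a write pass with SQLIO.
# Switches assumed from the SQLIO readme: -kR/-kW (read/write), -t (threads),
# -o (outstanding I/Os per thread), -b (I/O size in KB), -frandom, -s (seconds),
# -LS (latency stats). Verify against your copy of SQLIO before relying on them.
import subprocess

TEST_FILE = r"E:\sqlio\testfile.dat"   # hypothetical path to the 128GB test file

for mode in ("R", "W"):                # R = read pass, W = write pass
    for size_kb in (8, 64, 256):       # illustrative I/O request sizes
        cmd = ["sqlio", f"-k{mode}", "-t28", f"-b{size_kb}",
               "-o8", "-frandom", "-s300", "-LS", TEST_FILE]
        subprocess.run(cmd, check=True)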

READ Results -

This was quite surprising as RAID 5 only showed about an 8-10% I/O performance gain over RAID 10 across all the measurements.

WRITE Results -

As expected, RAID 10 gave a significant I/O performance gain over RAID 5, approaching 50% and in some tests exceeding it.  


I think I said in my original post that our data is almost exclusively read, which it is: throughout the production day it is almost 95% reads.  But when we looked into some of the performance bottlenecks we were experiencing, it came to light that the degradation we were seeing during the day was because the nightly uploads (writes) were not finishing on time and had been overlapping into the production (read) hours.

So for us, by accepting an 8-10% reduction in overall READ I/O throughput in discounting RAID 5 and selecting RAID 10 (the new hardware gives a significant increase in READ throughput over the existing 5-year-old hardware anyway, which offsets this loss), we gained the I/O throughput required for the lengthy out-of-hours write requests to complete in time and allow maximum READ throughput during the core production hours.

So for us RAID 10 was definitely the answer, but that's because it suits the data type and usage we have in the business.  If we had no nightly writes to worry about then RAID 5 would definitely have been the way to go, as long as the business accepts the reduced resilience you get with RAID 5.  The improved resilience of RAID 10 was an added benefit from a support point of view but was not a key deciding factor for us.  

Thanks for all your responses, and I hope these stats back up what was being said throughout the thread.  

Matt  

 


READ-Throughput.JPG
WRITE-Throughput.JPG
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 22427492
Thanks for the follow up info!  Interesting.
>> This was quite surprising as RAID 5 only showed about an 8-10% I/O performance gain over RAID 10 across all the measurements. <<

Well, 10% is actually a fairly significant difference over time. Basically that's what I would have expected.

And we all acknowledged that RAID 5 gives poor performance for modifications, as you've proven again :-) .
0
 

Author Comment

by:TheNumberUK
ID: 22427916
I wasn't trying to understate the value of an 8-10% I/O loss over time; I agree this is significant.  What I maybe should have said was that, as we have seen a 35%+ gain in base I/O performance just from using new hardware and switching to fibre connections over our older legacy hardware, a 10% loss of I/O through the RAID configuration still allowed us a significant gain (25%+) overall.  

If we had had to do this on our legacy hardware then the chosen configuration may well have been different, as we couldn't have lived with a 10% loss of I/O throughput.  

For me it was a very worthwhile exercise of actually being able to see how 13 spindles in a RAID 5 performs against 6 spindles in RAID 10 and hopefully these figures will help others make an informed decision or at least ensure that they do their own testing to see what is best for them.

If it hadn't been for the responses you guys made to my initial post, we might have just gone for what was widely perceived as the ideal configuration.  

There is no 'better' configuration between RAID 5 and 10, merely a configuration that suits your requirements.  I have read a lot (not on this forum, I might add) of 'you should do this or that because it's better or outperforms the other'; it's impossible to say that without benchmarking in your own environment.

So, thanks again guys your input and feedback has been very valuable.

Matt  

 
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 22428175
Excellent points as well.

Thanks again to you too.
0
 
LVL 55

Expert Comment

by:andyalder
ID: 22428207
>13 spindles in a RAID 5 performs against 6 spindles in RAID 10
It's hardly a valid comparison if you have twice as many spindles, no wonder RAID 5 was faster if it had twice the number of disks.
0
 

Author Comment

by:TheNumberUK
ID: 22428274
It was 13 individual spindles against 6 pairs (13 discs against 12; this was highlighted at the start of the thread), not 6 discs.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 22428338
??  Hmm, confusing.  It's the same physical 13 disks, right, but RAID 10 naturally requires they be split 6 and 6.

This is from your own earlier post @ 11.16.2007 at 09:15AM CST ::
"
>RAID 10 generally gives better I/O performance over RAID 5 but will the two sets of 6 discs mirrored in RAID 10 give better I/O than 13 discs in RAID 5?

Yes, especially with write since RAID 5 needs to do twice as many physical I/Os for a logical write than RAID 10 needs. Also with a disk failed RAID 10 knocks the spots off RAID 5 since it just reads the other disk in the mirror rather than reading all the disks in the array to calculate data from parity. Also rebuild after a single disk failure in RAID 10 takes minutes whereas with RAID 5 it takes hours. Plus you can use the 13th disk as a hot spare.
"
0
 
LVL 55

Expert Comment

by:andyalder
ID: 22428415
I see, 12 vs 13 disks. The 13 should be about 8% faster (13/12) at read since it has one more spindle (you've proved it does balanced reads or it would be 50% down) and it was 10% faster so near as dammit equal read speeds for both RAID levels.
0
 

Author Comment

by:TheNumberUK
ID: 22428461
That is correct, it was the same set of discs used in the tests: 13 used in RAID 5 and 12 in the RAID 10.

I think I caused confusion earlier by stating 6 spindles when it was 6 pairs.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 22428644
>>  and it was 10% faster so near as dammit equal read speeds for both RAID levels. <<

Huh???

The **original parameters of the q specified EXACTLY 13 drives**.  The q was, which would be faster for reads.

RAID 5 is clearly faster, so they are NOT equal.  

Yes, there's an "extra" drive, BUT THAT WAS A SPECIFIC GIVEN WHEN BEGINNING THE EVALUATION.  Of course RAID 10 could never be configured with an odd number of drives -- remember that "spare drive" comment :-) .
0
 
LVL 55

Expert Comment

by:andyalder
ID: 22428796
It was so long ago that I forgot what the question was Scott ;)
In case you're interested the old VA7100 ran RAID 10 with an odd number of disks; each data block was mirrored rather than whole disk pairs being mirrored.
0
