TommaX asked:

Comments regarding RAID 0/1 or 10

I needed some comments regarding a RAID setup that I would like to build...
I have six identical disks. I want to use the most fault-tolerant RAID level possible that will still give extremely good performance for a database application.

My thought is to use a RAID 0/1 or 10 - there is so much inconsistency in labeling these two levels that I am not exactly certain which is which.

This is where I could use some sorting out... I'm considering making three mirrored pairs, then striping across the pairs. I'm assuming this is RAID 0/1 - striped mirrors. Am I correct in thinking that (as long as both disks in any mirrored pair do not fail) this setup could sustain three failed drives and continue running, and that the fourth failure is what finally causes the array to fail? Or, with this setup, will the failure of only one drive fail the entire array, because the stripe gets broken?

Any advice would be appreciated
tabiv:

RAID 0+1 and RAID 10 are the same thing.
There is a website out there that suggests RAID 10 is different (acnc.com), but they're wrong. Their explanation of RAID 0+1 is incorrect, and their graph for RAID 10 doesn't make sense if you look at the striping.

Your first version is right. If you have a RAID 0+1 array with 6 drives, you can lose up to 3 drives as long as you don't lose both drives in any mirrored pair. RAID 0+1 is a RAID 0 array of RAID 1 sets, not the other way around (a RAID 1 array of RAID 0 sets).

Remember: the more drives in the array, the more likely you are to lose 2 drives at once.
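
To put numbers on that, here is a minimal sketch (Python, with made-up drive labels, purely illustrative) that enumerates which failure combinations a three-pair striped-mirror layout survives:

from itertools import combinations

# Three mirrored pairs striped together; the array dies only when
# BOTH drives of some pair have failed.
pairs = [("A", "B"), ("C", "D"), ("E", "F")]
drives = [d for pair in pairs for d in pair]

def survives(failed):
    # Survivable as long as no mirrored pair has lost both members.
    return not any(a in failed and b in failed for a, b in pairs)

for n_failed in range(1, 5):
    combos = list(combinations(drives, n_failed))
    ok = sum(survives(set(c)) for c in combos)
    print(f"{n_failed} failed drive(s): {ok}/{len(combos)} combinations survive")

# Prints:
#   1 failed drive(s): 6/6 combinations survive
#   2 failed drive(s): 12/15 combinations survive
#   3 failed drive(s): 8/20 combinations survive
#   4 failed drive(s): 0/15 combinations survive

So the best case really is three failures (one per pair), and any fourth failure is fatal.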

RAID 0+1 gives you the best performance and fault tolerance for databases. You might, however, want to make a RAID 1 array with 2 drives and a RAID 0+1 array with the other 4 drives so that you can split the I/O of the database and the transaction logs.


Ted
If you had 2 SCSI cards (preferably RAID) I would create RAID 5 and then mirror it.
If you are that concerned about losing data, I think that is the best way to go.
For business, RAID 5 is the way to go; it is typically more expensive, however. RAID 0 and 1 have become more for home use.
rrhunt28,

That's incorrect. RAID 0+1 is better than RAID 5; it just takes a lot more drives, so it is actually more expensive. We've spent a lot of money upgrading our largest database server from RAID 5 to RAID 0+1, because RAID 5 could not handle the disk I/O we needed.

It is more expensive to get a RAID 5 controller for home users, but the overall cost of RAID 5 is cheaper than RAID 0+1. The performance is better on RAID 0+1, though. A 250 GB RAID 5 array takes eight 36 GB drives; a RAID 0+1 array takes fourteen.
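
For what it's worth, here is a rough sketch of that drive-count arithmetic (it assumes one parity drive's worth of overhead for RAID 5 and 50% usable space for RAID 0+1, and ignores hot spares):

import math

# ~250 GB of usable space built out of 36 GB drives.
target_gb = 250
drive_gb = 36

data_drives = math.ceil(target_gb / drive_gb)   # drives needed just for the data
raid5_drives = data_drives + 1                  # plus one drive's worth of parity
raid01_drives = 2 * data_drives                 # every data drive is mirrored

print(f"RAID 5  : {raid5_drives} drives, {(raid5_drives - 1) * drive_gb} GB usable")
print(f"RAID 0+1: {raid01_drives} drives, {(raid01_drives // 2) * drive_gb} GB usable")
# RAID 5  : 8 drives, 252 GB usable
# RAID 0+1: 14 drives, 252 GB usable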

Ted
Oops forgot...

Gbdiver,

I've seen that called RAID 15 and RAID 51 before. Not sure which is correct (or both). Doing that would mean you would have to use software RAID to mirror the drives, and the server would take a decent performance hit if that's the case. I know there are some high-end (high-cost) solutions that do this, but I haven't seen a card you can do it yourself with (not that it's not out there, I just haven't looked). If you are referring to 2 SCSI cards that you know would support that, let me know - I'd be interested in checking it out.

Ted
TommaX (Asker):

tabiv,

I'm wanting to do basically the same thing you did, on a smaller scale though (14 drives?!) - move our databases off RAID 5 onto RAID 10, 0+1, 0/1, whatever. What I really know is the mind-numbing performance boost we got on a small four-drive array on one of our small servers after moving to RAID 10 from RAID 5, and I want that on our other servers...

I wanted some clarification before making the next move - upgrading our larger servers - and you gave me that :)  

By the way, to open up another can of worms - any suggestions on stripe size and allocation unit size? I'm thinking of an allocation size of 128k, and stripe size of 64k for the next server...

Grateful for your insight, tabiv
RAID 0+1 is faster than RAID 5 because the controller (hardware) or CPU (software) does not have to do the parity calculation. Data is simply split based on block size.

If you are working with databases, you will get your best performance if you separate your log files from the actual table data. Since the hottest disk activity is with your log files, you would definitely get the best performance putting them on their own, separate RAID 0+1 array. You can then place your tables on a RAID 5 array to get the best use of your disk space, or on a RAID 0+1 array to get the best performance. I would split them across separate controllers as well. That way, you are not losing performance by pushing all of the data down the same channel, and you avoid the overhead of software striping (the above arrays should be hardware based).

Keep in mind that disk size and speed are important as well.  As a rule of thumb, a disk that is the same speed but twice the size of another will be half as fast.  For example, a 36 gig, 10000 rpm drive is twice as fast as a 72 gig 10000 rpm drive, and would be four times as fast as a 146 gig, 10000 rpm drive.  This is only a rule of thumb, because it does not take into account controller cache differences, platter coatings, and seek algorithms.

So, to get even better performance from our database, which has log files on a RAID 0+1 array and tables on a RAID 0+1 array, we can put our log files on an array built on fast, small disks, like 36 gig, 15000 rpm disks. Log files don't get very large, because if they do, performance will suffer no matter what. We can then put our tables on 146 gig, 10000 rpm disks and store a lot more data.

Stripe size should be set to a larger size if you are working with large block data.  If you are doing lots of small data blocks, like just names and phone numbers, then a smaller block size is a more efficient use of disk space.
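
To make that concrete, here is a small illustrative sketch (assumed I/O sizes, requests aligned to the start of a stripe unit) showing how many stripe units - and therefore how many drives - a single request touches at different stripe sizes:

def units_touched(io_size_kb, stripe_kb, offset_kb=0):
    # Number of stripe units a request spans, given its size and starting offset.
    first = offset_kb // stripe_kb
    last = (offset_kb + io_size_kb - 1) // stripe_kb
    return last - first + 1

for stripe_kb in (16, 64, 128, 256):
    small = units_touched(8, stripe_kb)    # e.g. an 8 KB database page
    large = units_touched(256, stripe_kb)  # e.g. a 256 KB sequential read
    print(f"stripe {stripe_kb:>3} KB: 8 KB I/O -> {small} unit(s), 256 KB I/O -> {large} unit(s)")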

RAID LEVEL   TYPE                          COST     SPEED         DISK USE
----------   ---------------------------   ------   -----------   ------------
0            striping - NO PROTECTION      low      fastest       100%
1            mirroring                     medium   slowest       50%
5            striping with parity          medium   medium        about 80%
0+1 or 10    striping and mirroring        high     fast          50%
0+5 or 50    striping across parity sets   high     medium-fast   about 70-80%
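
Applied to the six disks in the question, the table above works out roughly as follows (a sketch that assumes 36 GB drives purely for illustration; RAID 5 usable space is really (n-1)/n rather than a flat "about 80%"):

n_drives, drive_gb = 6, 36
raw = n_drives * drive_gb

levels = {
    "RAID 0 (stripe)":        raw,                        # no redundancy at all
    "RAID 1 (mirror)":        raw // 2,                   # every byte written twice
    "RAID 5 (stripe+parity)": (n_drives - 1) * drive_gb,  # one drive's worth of parity
    "RAID 0+1 / 10":          raw // 2,                   # mirrored pairs, then striped
}
for name, usable in levels.items():
    print(f"{name:<24} {usable:>4} GB usable of {raw} GB raw ({usable / raw:.0%})")
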
TommaX (Asker):

durindil

I hear that matching the block size on the server's disks to the average total bytes/sec coming across the network to the server is the most efficient and fastest configuration - your comment on this?

I realize that many other factors play into good database performance over a network - this discussion with experts will hopefully at least put me into the ballpark, and give me a good starting point...

Appreciate all the contributors to this question  :)
For business use - RAID 5
No second answer.
RAID 50 - if the cost fits your budget.
If you are looking for the most common config, RAID 5 is good - it makes the best use of storage. If you need every last bit of power, RAID 0+1.


--------------------------------------------------------------

RAID 0 offers the best performance. Mirroring drives, so you have 3 drives of data and 3 drives of the same data. This gives you 3 drives of storage: 3 data + 3 mirror.

RAID 1 is striping, so you have 6 drives; if one drive fails, you are toast. This gives you 6 drives of storage - but no error protection.

RAID 5 is the most common; it is like RAID 1, where data is striped across all the drives, but some data is repeated. So if one drive fails, you just replace it; if 2 drives fail, you are toast. RAID 5 gives the most storage: 6 drives - 1 = 5 drives of storage.

RAID 0+1: you have 3 drives striped, then they are mirrored to the other 3 drives. This is the fastest solution because there is no calculation of which bits to repeat like in RAID 5. This solution gives you 3 drives of storage (3 striped and 3 striped mirrors).


I've seen mixed things on the best stripe size for databases, so I am not sure of the best answer. I don't think there is one answer that covers everything, though. It probably depends on the OS, database, storage array, and transport medium.

Yeah, I have heard about taking block size and packet size into consideration when you are choosing stripe size. But someone just told me about that; I have never seen it in writing or anything. Plus it made my head hurt, so I am not really sure what to say about it - but it makes sense.

One of the SQL Server resource kit books recommended either 64k or 128k. I can't remember which, or whether it depended on what exactly you were doing with it. If you have access to the SQL Server Resource Kit, check there. When we were expanding and optimizing our biggest database server (over a terabyte in size) last year, I went with 128k.
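
Since stripe size and allocation unit keep coming up together, here is a tiny sketch (assumed values, everything in KB, partitions assumed aligned) that just reports how many stripe units one filesystem allocation unit spans for a few candidate combinations:

def stripe_units_per_alloc_unit(alloc_unit_kb, stripe_kb):
    # Ceiling division: how many stripe units one allocation unit covers.
    return -(-alloc_unit_kb // stripe_kb)

for alloc_kb, stripe_kb in ((64, 64), (128, 64), (64, 128)):
    spans = stripe_units_per_alloc_unit(alloc_kb, stripe_kb)
    print(f"alloc {alloc_kb} KB / stripe {stripe_kb} KB -> spans {spans} stripe unit(s)")

A 64 KB allocation unit on a 64 KB stripe maps 1:1, while a 128 KB allocation unit on a 64 KB stripe always touches two stripe units (two drives) - whether that helps or hurts depends on the workload.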

If anyone has some info on optimal stripe size for databases, I'd be interested in hearing it myself.

Ted
MaKaVeLi_Da_DoN,

That is incorrect.

Ted
ASKER CERTIFIED SOLUTION
durindil:

(Accepted answer visible to Experts Exchange members only.)
One interesting issue is IDE vs. SCSI RAID. Of course SCSI is the fastest and best!

I am seeing more IDE and Serial ATA drives. These offer nice performance for the price. They often have simple controllers with limited cache, however, and often only offer RAID 0 or 1, not both (0+1), and sometimes RAID 5.

I think IDE raid is a good solution for a low cost small office.

For a medium size branch office, SCSI raid is a good choice.

For the enterprise, SAN storage like EMC, Hitachi, XIOtech, or the like is the way to go. These are large disk arrays that can be segmented for different operating systems.

 
Actually, the large-scale vendors such as EMC now support Serial ATA in their mid-range arrays. The Clariion CX200, 400, and 600 all support Serial ATA drives. Reliability is much better with ATA drives than a few years ago, and the line is beginning to blur. Don't get me wrong, I just love the speed and bandwidth of a pure Fibre Channel attached drive, but IDE is definitely good for the medium-sized office and even the enterprise.

As a matter of fact, I just finished a large implementation where I put in 4 TB of Serial ATA disk in a large backup solution.  The backups are spooled to the Serial ATA disk via the SAN, and then backed up to tape, so host backup windows are extremely short.
TommaX (Asker):

durindil

What you do is way out of my league, with my little network....but everything you have advised is sound, and your comments and info are extremely good.

I will look into the other factors, particularly the cache on our controllers for our present systems, and look at Serial ATA for the future - thank you!

I want to increase points for your continued contribution to this discussion, and your expert skills  :)
First of all, RAID 0+1 is not the same as RAID 1+0.
Some vendors may not use the terms correctly in their implementations, but in most cases they are different.
The main difference is how the array survives a single disk failure or multiple disk failures, depending on which disks fail (which 'column').

RAID 10
http://www.acnc.com/04_01_10.html
Under certain circumstances, a RAID 10 array can sustain multiple simultaneous drive failures.

RAID 01
http://www.acnc.com/04_01_0p1.html
A single drive failure will cause the whole array to become, in essence, a RAID Level 0 array


RAID 5
Many RAID 5 implementations are getting better and better performance. A large cache (512 MB) per controller is not uncommon.

ATA drives array:

NEXSAN ATABOY/BEAST/etc.
A good startup that focuses on cost-effective ATA and FC storage. Supports multiple platforms.

Apple Xserve RAID
2 Gb Fibre Channel backend using ATA drives; $499 for their Fibre Channel card (an LSI rebrand).
This array has 14 drives, each of them 180 GB. Two independent controllers, each controlling half the array. Very fast and cost-effective ($10,999 for 2.5 TB). Their RAID 5 is very fast and in many cases as fast as RAID 0+1.

These are ideal for nearline storage such as Disk to Disk backup or Archiving.
TommaX,

Not that I am complaining, but I would have thought you would have at least split the points between me and durindil, especially since I answered the question completely first, before the other questions about stripe size were asked. Oh well...


HKchilam,

acnc.com's explanation of RAID 0+1 and RAID 10 is incorrect (as I already pointed out above). It's generally accepted that RAID 0+1 and 10 are the same; they are the only ones out there that I have seen portraying them as different. Even their graph of RAID 10 doesn't make sense.

TommaX (Asker):

tabiv

I'm new here to Experts Exchange - splitting the points between you and durindil makes sense to me, and really was what I felt I should do - but I didn't see an easy way to do that. I thought I had to choose a single "more complete" answer - which isn't, and wasn't, fair to you when you answered first....please let me know how to do this - I'll give you the points you deserve...
I recall that using VERITAS Volume Manager (on Unix) or Sun DiskSuite, you can set up 0+1 and 1+0 differently, and the result is a different tolerance of disk failures, as described earlier.

Example with any volume manager:
On hardware RAID, get a bunch of RAID 1 disks (two-way to keep it simple), present the LUNs to the host, and use a host-level volume manager to stripe across the disks. You get RAID 1+0. Depending on which disks fail, you may be able to survive more than two disk failures.
-Bruce

On hardware RAID, get a bunch of disks and create a RAID 0 LUN. Do the same and create another LUN. Present both LUNs to the host and use the volume manager to mirror the two LUNs. You get RAID 0+1.

If one disk dies, you lose that whole stripe; you end up with a RAID 0 surviving mirror.
If you lose a second disk on the surviving mirror, you lose data availability.

-Bruce
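
To make the layering difference concrete, here is a toy simulation (hypothetical disks labelled 0-5; it models the host-level layering described above, not an integrated controller implementation) that checks which two-disk failures each layout survives:

from itertools import combinations

def survives_1_0(failed):
    # Stripe across three mirrored pairs: dies only if some pair loses both members.
    pairs = [(0, 1), (2, 3), (4, 5)]
    return all(not (a in failed and b in failed) for a, b in pairs)

def survives_0_1(failed):
    # Mirror of two three-disk stripes: one failed disk kills that whole side,
    # so data survives only while at least one side is completely intact.
    sides = [(0, 1, 2), (3, 4, 5)]
    return any(all(d not in failed for d in side) for side in sides)

two_disk = list(combinations(range(6), 2))
print("1+0 survives", sum(survives_1_0(set(c)) for c in two_disk), "of", len(two_disk))
print("0+1 survives", sum(survives_0_1(set(c)) for c in two_disk), "of", len(two_disk))
# 1+0 survives 12 of 15 double failures; 0+1 survives only 6 of 15.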

Interesting discussion.

Of course you mirror before you stripe, but not only for reliability (it's very unlikely a second drive will fail before the replacement for the first failure has rebuilt) - more for performance during a failure and a faster rebuild.

There are cards that do RAID 51; not worth using unless you expect a shelf or channel failure and can suffer the performance of RAID 5 during an outage.

Points for tabiv, https://www.experts-exchange.com/questions/20691156/Points-for-tabiv.html



Ahh, I see where you are confused, hkchilam. RAID 0+1 and 1+0 are different, but the difference is only in how the data is sent. If the array mirrors data and then stripes it, it is RAID 1+0. If it stripes and then mirrors, it is RAID 0+1. What you are describing in VxVM is using a RAID 0 stripe in a RAID 1 mirror. That is not RAID 0+1. RAID 0+1 and 1+0 are integrated - which is why they can tolerate the failure of one drive in every mirror pair.
Someone told me before never to have a debate about RAID 0+1 vs. RAID 1+0. Now I know why.
I am done. :)
TommaX (Asker):

Thanks for all of the discussion!

I've been away for several days - my wife and I delivered our first child together, a *perfect* little girl!

tabiv - reading back over this discussion, I really feel that you should have points, as well as durindil - I see that andyalder has given you his points for your trouble - hope that's satisfactory to you, and makes you feel special :) as well....

durindil - thanks for all of your good advice! I see that the only difference is in how the data gets sent to the drives. As long as the array will sustain multiple drive failures, and is fast, and can be rebuilt relatively easily and quickly, I'm confident with it...thanks!
Here is some trivia for those inclined to know.

RAID 0+1 is indeed the same as RAID 10. The term RAID 10 was introduced to the market by some marketing person at Compaq - and it is (or was) called RAID 6 by the marketing folks at HP. But they are all the same...

What would the marketplace do without marketing folks...
Actually, RAID ADG is better than any of the above solutions for data protection. Performance will always take a hit when you have increased protection, but ADG has better performance than 5 and better protection than 1; you can lose 2 drives on the same stripe and still recover. You need a minimum of 4 disks and only get 50% of the total space, but you can add additional drives singly and get increased capacity from those drives.
I think you'll find RAID 6 (or ADG, as HP/Compaq call it) is slower than any other RAID level: 6 I/Os per write as opposed to 4 for RAID 5 and 2 for RAID 10. See http://h18004.www1.hp.com/products/servers/proliantstorage/arraycontrollers/adg/questionsanswers.html : "RAID ADG has equal performance to RAID 5 when reading data but is slower when writing data due to the extra parity data that RAID ADG writes".
I believe I covered that:

>>>Performance will always take a hit when you have increased protection
Oops, sorry - I did say it had better performance than 5; I'll have to check on that.
The best benchmark material I came across finds that it may very slightly degrade write performance and can slightly to moderately improve read performance. So the HP blurb probably errs on the side of better safe than sued.
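
To put the write-penalty numbers quoted above into perspective, here is a small sketch with an assumed workload (1000 front-end IOPS at a 70/30 read/write mix; real controllers with write-back cache will blunt some of this):

# Back-end disk I/Os generated per front-end I/O mix, using the
# write penalties quoted above: 2 for RAID 10, 4 for RAID 5, 6 for RAID 6/ADG.
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6/ADG": 6}

front_end_iops = 1000
read_fraction = 0.7

for level, penalty in WRITE_PENALTY.items():
    reads = front_end_iops * read_fraction
    writes = front_end_iops * (1 - read_fraction)
    backend = reads + writes * penalty
    print(f"{level:<10} -> {backend:.0f} back-end disk I/Os per second")
# RAID 10    -> 1300
# RAID 5     -> 1900
# RAID 6/ADG -> 2500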