Which is faster: RAID 1 or RAID 5?

I have a big database (approximately 20 GB). The software allows the database to be split between several logical/physical drives.
I have 6 SCSI drives.

I'm thinking about two ways of organizing them:

1) Create 3 RAID 1 arrays out of my 6 hard drives and split the database between those 3 arrays.
2) Create 2 RAID 5 arrays out of my 6 hard drives and split the database between those 2 arrays.

I need help choosing the better of these two options.
The goal is to give clients the fastest possible access to the database.
Any calculations, or a link to a page explaining how to calculate the total speed for each variant, would be the most helpful.

jdlambert1 Commented:
RAID 1 just mirrors from one drive to another, with no performance improvement. RAID 5 gives a performance boost. Both provide some fault tolerance, so I'd go with RAID 5. That's the most common implementation, as the best balance of price, performance, and fault tolerance.
I might mention that RAID 0 is the fastest, but it provides no fault tolerance -- if one drive fails, it's toast. Some folks, if they can afford it, set up two sets of RAID 0, then mirror one to the other to get the best performance and some fault tolerance, but the SCSI adapter has to support that configuration, and they charge more for that kind...
AlexC77 (Author) Commented:
The SCSI controller is not an issue; I can get any.
I need some examples with calculations if possible.

Oh, and just because you have RAID 5 and the system will keep working if a drive dies, it doesn't mean you can safely ignore the drives. I was called in once and they couldn't understand why their system quit, because they "paid all that money" for a fault tolerant system. Well, the first drive failed months before and no one noticed. When the second one failed and the system quit, it was too late. Not surprisingly, these folks didn't have recent backups, either.

So, someone should check the "idiot lights" every day, or, if the drives and controller card/drivers support it, you can configure many systems to send an email alert when a drive fails.

AlexC77 (Author) Commented:
This is not a problem either. One of the techs checks everything every day.
Oh, one other thing. If you mirror two 100GB drives, you'll have 100GB of usable space -- you lose half for the fault tolerance without gaining any performance. If you gang three 100GB drives in RAID 5, you'll have 200GB of usable space -- you only lose 1/3 for the fault tolerance while increasing performance significantly.
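
As a rough sketch of that arithmetic (assuming identical drives; the function name `usable_capacity_gb` and the 100GB drive size are only for illustration), the usable space of the two layouts from the question can be compared like this:

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: float) -> float:
    """Usable space of one array built from identical drives."""
    if level == "raid1":
        # Every drive in the mirror holds a full copy of the data.
        return drive_gb
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space goes to parity.
        return (drives - 1) * drive_gb
    raise ValueError(f"unsupported RAID level: {level}")


# The two layouts from the question, assuming 100GB drives.
option_1 = 3 * usable_capacity_gb("raid1", 2, 100)  # three mirrored pairs
option_2 = 2 * usable_capacity_gb("raid5", 3, 100)  # two 3-drive RAID 5 arrays
print(option_1, option_2)  # 300 400
```

This only covers space; it says nothing about speed, which is what the rest of the thread is about.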
AlexC77 (Author) Commented:
Thank you for that explanation, but I am familiar with different RAID options, and I know how to calculate space.
I need to know if there is any way to calculate the speed.
I don't know of any sites that provide statistics. That would vary a *lot* with every combination of drive brand, SCSI controller brand/model, and configuration.

Since you said you could get any controller... If you haven't gotten the drives yet, you can consider getting a 3ware controller (www.3ware.com) and using less expensive IDE drives. Not only is it less expensive, they give superior performance -- they even provide a performance boost on RAID 1, because they made their controller smart enough to read half from one of the mirrored drives, while the other half is being read from the other mirrored drive. SCSI doesn't do that, it just reads from one of the mirrored drives unless that drive fails, then it reads from the other one.

3ware was established by Adaptec engineers who were unhappy with the lack of intelligent innovation at Adaptec, so they left and started their own company. They're doing a great job designing their systems. So why aren't other companies using the same great techniques? Patents!
I've been poking around, between posts, and still haven't found any web sites with useful stats. The only thing I've found at all only says how to calculate throughput by dividing the amount of data read or written by the amount of time it takes -- duh.

I'm a little surprised that some of the hardware vendors don't seem to have useful stats in comparison charts for marketing purposes, but I guess it's just because there are too many variables.
AlexC77 (Author) Commented:
Can you post the URL you found here?
jayca Commented:

This goes into great detail about performance and RAID configs for optimum SQL setup/ performance.
From a theoretical point of view, and with no limitation on the bus:

RAID 1 (two disks):
- you read 2x faster than with just one disk, because reads can be done simultaneously from the first and from the second disk
- you write at the same speed as with one disk (you write the complete information to both disks simultaneously)

RAID 5 (three disks):
- you read 2x faster than with just one disk, because you read half of the information from each of two disks
- you write 2x faster than with just one disk, because each write is split into halves spread over the three disks of your system

So from a performance point of view RAID 5 is more powerful, BUT every read / write operation has to be computed to reconstruct / split the information, so the quality of your RAID 5 card is very important!

From a fault tolerance point of view, nothing is better than RAID 1: you need nothing more than one disk to recover all your data. You can take a disk, put it in another server, and it will work... Data recovery specialists who work on disks after big crashes all agree on this:
- if you have 2 broken disks (this is very frequent, since disk failures are often caused by electrical shocks that hit your whole system at once) and you must send your disks to a specialist to recover the data, this is simpler with two identical disks (the sectors that cannot be read from one are recovered from the second) than with three complementary disks from a RAID 5 system, where the data is split across the disks.

My recommendation is:
- for the best performance, choose RAID 5
- if fault tolerance is your concern, choose RAID 1


for details on different raid levels
>- for the best performance, choose RAID 5
>- if fault tolerance is your concern, choose RAID 1

AFAIK, RAID 0 combines drives into a single logical unit, but can write to all contained drives (almost) simultaneously.  Thus you can have a 160G array comprised of 4 40G drives which works at almost 4x the speed of a similar 160G drive.

AFAIK, RAID 1 combines drives into a single logical unit, but writes the same data to all drives in the array (without too much of a performance hit for writing to additional drives).  Thus, you could have a 40G array comprised of 4 40G drives, which would give you triple data redundancy at almost the same speed as a single similar 40G drive.

AFAIK, RAID 5 combines these features (where does hot-swap figure in all this, BTW?), giving you the speed benefits of RAID 0 but also the redundancy of RAID 1 (but of course you need even more drives).

The speed of the array (if it's comprised of decent components) should be roughly the base speed of the component drives (since they should all be of approximately the same size and speed) multiplied by the number of striped drives (not the mirrored drives) in the array.

Hope that helps...

complexymetron Commented:

> My recommendation is:
> - for the best performance, choose RAID 5
> - if fault tolerance is your concern, choose RAID 1

Erm, that's not exactly right.
Many people seem to forget that write speed drops in a RAID 5 environment. That's because for each change of information in a slice (a stripe of data in the array), the controller has to read the parity of the slice first, calculate a new parity, and then write it back to the array.
Read performance with RAID 5 is indeed better than with RAID 1, and more so the more channels and drives you add.
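
To make the write penalty concrete, here is a minimal sketch of a RAID 5 small write done as read-modify-write. It assumes XOR parity and that the controller also reads the old data block (a common approach, but controllers differ); `xor_blocks` and `small_write_parity` are illustrative names, not any controller's API.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data block in a stripe.

    Physical I/Os behind this one logical write:
      1. read old data    2. read old parity
      3. write new data   4. write new parity
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Hypothetical 3-drive stripe: two data blocks plus their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

new_d0 = b"ZZZZ"
new_parity = small_write_parity(d0, parity, new_d0)

# Same result as recomputing parity over the whole stripe.
assert new_parity == xor_blocks(new_d0, d1)
```

The XOR itself is cheap; the cost is the two extra reads and the extra write per logical write.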

AlexC77, you mentioned a setup with 3 RAID 1 Arrays and splitting up the database between the 3 Arrays.
I'd rather recommend using a RAID 1+0 (also called RAID 10) array if your controller allows it. It's a combination of RAID 1 (two drives mirrored) and RAID 0 (one big stripe over all 3 RAID 1 arrays).
Theoretical write speed with 6 drives: 3x 1HDD
Theoretical read speed: 6x 1HDD
It won't get faster than that.

One other thing: if it's possible for the db to access the host drive (the RAID array) directly (i.e. with no NTFS or FAT filesystem on it), you can speed up access a bit, because NTFS always needs CPU time and extra accesses to the array just for administration.
Just create a partition on the array, DON'T format it, and tell your db to use the partition completely. Some databases allow this.
RAID-1 will be about 50% faster than RAID-5 doing reads regardless of size

There are, in fact, two variations of RAID-1, RAID 0+1 and RAID 1+0. Without going into too much detail here (this is going to wind up being a long answer), RAID 0+1 involves creating a RAID-0 disk stripe first, and then mirroring that stripe's contents to an identical stripe. In RAID 1+0, the disks being allocated are mirrored in pairs first and then those pairs are striped. Looks and sounds similar but there are some serious technical differences. The bottom line is that 1+0 is better than 0+1.

RAID-5 is a form of parity RAID, where data is striped across all the disks in the RAID stripe (the collection of disks that make up RAID-5), plus one extra disk. This extra disk contains calculated values that are generated by applying Boolean arithmetic to all of the data on the other disks. (What I have just described is actually RAID-4. RAID-5 works the same way, except that it takes that parity data and stripes it across all of the disks in the RAID-5 stripe to improve performance.) Any lost disk in a RAID-5 stripe can be recovered through the use of the parity information.
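
A small sketch of how that recovery works, assuming the "Boolean arithmetic" is a byte-wise XOR (the usual choice for single-parity RAID). The block contents and the helper name `xor_parity` are made up for illustration.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe on a 4-disk array: three data blocks and one parity block.
data = [b"disk", b"raid", b"five"]
parity = xor_parity(data)

# Simulate losing the second disk and rebuild its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_parity(survivors)
assert rebuilt == data[1]
```

Note that the rebuild has to read every surviving disk in the stripe, which is why a RAID 5 rebuild touches the whole array.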

RAID-5 writes slower than RAID-1 for several reasons, including all of the arithmetic that must be done every time a write is generated. What's more, in order to do the calculations, in some cases, data must be read from all the disks so that the calculations can be made. RAID-1 does not require any math or extra reading. The rule of thumb is that if your disks are going to do less than about 15-20% writing, then RAID-5 may be OK. Any more than that, and you should probably not do RAID-5.

When a disk is lost (and isn't that why you are looking at RAID in the first place?), replacing a disk in RAID 1+0 requires copying all the data from the surviving copy of the failed disk onto the replacement disk. In RAID-5, all the data on all the disks must be read and the appropriate calculations made, before the data can be written to the replacement disk.

If you lose two disks at the same time in RAID-5, all the data is lost, and must be recovered from backup tapes. If you lose two disks in RAID 1+0, unless they happen to be both sides of the same mirror, the system will be able to recover your data without having to resort to backup tapes.
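
For a 6-drive setup like the one in the question, the two-disk case can be enumerated directly. This is only a sketch: it assumes every pair of drives is equally likely to fail together (real failures are often correlated), and it counts a layout as surviving only if no array loses more than one drive. The names `survives` and `survivable_fraction` are illustrative.

```python
from itertools import combinations


def survives(failed: tuple[int, ...], arrays: list[tuple[int, ...]]) -> bool:
    """True if no array loses more than the one drive it can tolerate."""
    return all(len(set(array) & set(failed)) <= 1 for array in arrays)


def survivable_fraction(arrays: list[tuple[int, ...]], total_drives: int = 6) -> float:
    """Share of all possible two-drive failures that the layout survives."""
    double_failures = list(combinations(range(total_drives), 2))
    ok = sum(survives(failed, arrays) for failed in double_failures)
    return ok / len(double_failures)


layouts = {
    "RAID 1+0 (3 mirrored pairs)": [(0, 1), (2, 3), (4, 5)],
    "2 x 3-drive RAID 5": [(0, 1, 2), (3, 4, 5)],
    "1 x 6-drive RAID 5": [(0, 1, 2, 3, 4, 5)],
}

for name, arrays in layouts.items():
    print(f"{name}: {survivable_fraction(arrays):.0%} of double failures survivable")
```

Under these assumptions, 12 of the 15 possible double failures are survivable with three mirrored pairs, 9 of 15 with two 3-drive RAID 5 arrays, and none with a single 6-drive RAID 5.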

Please note that if your disks are in a hardware array, then the performance comparisons are likely invalid, since those arrays generally cache their data, hiding those performance issues.
If speed is the only concern, use RAID 0. Striping data over several disks is by far the best option for sheer performance. BUT it has NO FAULT TOLERANCE!!! If one disk dies, your data is gone.

If speed is a requirement and you also want data safety, use RAID 10 (explanation at http://www.acnc.com/04_01_50.html): high speed and fair reliability. It can in some cases survive multiple drive failures. Drawbacks: very expensive / high overhead / tricky to scale.

If overhead and making full use of the disks is an issue, stick to good old reliable RAID 5. Speed is essentially decided by the disks, so it's their output (and the controller's, of course) that will be the limitation. Put some monitoring software on the array and buy 1 or 2 replacement disks to keep on the shelf.

If performance drops below a desired level you have a legit reason to request funds for a new toy! ;-)
Just to back others up, the sums are quite easy, RAID 1 does 2 physical I/O per logical write, RAID 5 does 4 physical I/Os per logical write, RAID 6 or whatever with double parity does 6.

Example: 100 I/Os per second for each disk, 33% writes, 66% reads, 6 drives, and assuming the controller is intelligent enough to read off either disk in RAID 1 (results rounded):

RAID 5:  600 / (.66 * 1 + .33 * 4) ≈ 300
RAID 1:  600 / (.66 * 1 + .33 * 2) ≈ 450
RAID 0:  600 / (.66 * 1 + .33 * 1) ≈ 600
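
The same sums as a small script, in case you want to plug in your own numbers. It is a sketch of the rule of thumb above, not a benchmark: it ignores controller cache, seek patterns, and bus limits, and `effective_iops` is just an illustrative name. The write fraction is taken as exactly one third, which is why the results come out round.

```python
# Physical I/Os per logical write, as given above.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid5": 4, "raid6": 6}


def effective_iops(level: str, drives: int, iops_per_drive: float,
                   write_fraction: float) -> float:
    """Logical I/Os per second the array can sustain for a given read/write mix."""
    raw_iops = drives * iops_per_drive
    read_fraction = 1 - write_fraction
    # Each read costs 1 physical I/O; each write costs the level's penalty.
    cost_per_logical_io = read_fraction * 1 + write_fraction * WRITE_PENALTY[level]
    return raw_iops / cost_per_logical_io


for level in ("raid5", "raid1", "raid0"):
    print(level, round(effective_iops(level, drives=6, iops_per_drive=100,
                                      write_fraction=1 / 3)))
```

Lowering `write_fraction` narrows the gap between RAID 5 and RAID 1, which is the point several posters make about read-heavy databases.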
crazijoe Commented:
<Any calculations or link to page about how to calculate the total speed for each variant will be the best help.>

You can calculate the speed with various software tools. But each manufacturer has different specifications for its drives: different seek times, spindle speeds, etc. It would be very hard to list the data transfer speed of every configuration of hard drives in different arrays. Your best bet would be to check the detailed specification or data sheet of the hard drives you are interested in. This will give you a baseline for the drive's performance.

some examples
If you use 6 U320 15k RPM drives in any configuration, the bottleneck will likely be the PCI bus.  Many server mainboards have multiple PCI buses.  I think it would make sense to have 2 RAID 5 (or 3 RAID 0 if you don't need fault tolerance) arrays, each on its own PCI bus and controlled by its own 64 (not 32) bit PCI controller card.  Without knowing the exact specs of everything from the drive (RPM, avg seek, SCSI/SATA, etc) to the controller to the mainboard to even the mainboard chipset, it's not really possible to give an exact answer.  All the suggestions above make sense so far...but for a more specific recommendation we'd need more specific information on the system.
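
A back-of-the-envelope sketch of the bus argument. The theoretical peak of a parallel PCI bus is its width in bytes times its clock; the 70 MB/s per-drive figure below is purely an assumed placeholder (take the real number from the drive's data sheet), and the function names are illustrative. Protocol overhead and RAID parity/mirroring traffic are ignored.

```python
def pci_peak_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel PCI bus: bytes per transfer times MHz."""
    return width_bits / 8 * clock_mhz


def array_ceiling_mb_s(drives: int, drive_mb_s: float, bus_mb_s: float) -> float:
    """Sequential throughput is capped by whichever is slower: drives or bus."""
    return min(drives * drive_mb_s, bus_mb_s)


ASSUMED_DRIVE_MB_S = 70  # hypothetical sustained rate per drive

for width, clock in [(32, 33), (64, 66)]:
    bus = pci_peak_mb_s(width, clock)
    ceiling = array_ceiling_mb_s(6, ASSUMED_DRIVE_MB_S, bus)
    print(f"{width}-bit/{clock} MHz PCI: bus peak {bus:.0f} MB/s, "
          f"6 drives capped at {ceiling:.0f} MB/s")
```

With these assumed numbers a 32-bit/33 MHz bus, not the drives, is the limit, while a 64-bit/66 MHz bus leaves the drives as the limit.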
Lotus30306 Commented:
Great article:

You Don't Know Jack about Disks
ACM Queue vol. 1, no. 4 - June 2003
by Dave Anderson, Seagate Technology

http://www.tweakers.net/reviews/432 shows the effects of PCI bus bandwidth bottleneck w/ benchmarks
fixnix Commented:
and http://www.alliancesystems.com/products/moreinfo/Affecting_Raid_Performance.pdf gives some numbers you can use to calculate expected performance based on your bus configuration (speed, width, and number of).
RAID 0 and RAID 5 both support striped reads, which will give best "read" performance.  (As far as I can see, an intelligent controller could do this with RAID 1 as well, but I know of nobody who makes a controller that implements this....)

Because RAID 5 needs to calculate and store the redundancy information, database writes will be slower than with RAID 0 or 1.

So you need to determine whether reads outnumber writes by enough to cover the write penalty with RAID 5.  This is common, but may or may not be true in your particular case.

". SCSI doesn't do that, it just reads from one of the mirrored drives unless that drive fails, then it reads from the other one."

Did I read that right???  Hm, I haven't seen a SCSI controller that works like that...