RAID 1 and 5 with parity

Hello EE

I would like an explanation, in layman's terms, of RAID 1 and RAID 5 with parity. Are these the most common levels in medium and large networks?

How many HDDs do you need for each level, and how does each one work?

I have searched online for the basic RAID information, but I need the breakdown in layman's terms. Thanks much.
This might help:

RAID 10 is also very common because it combines the speed advantage of a RAID 0 stripe with the safety of a RAID 1 mirror.  RAID 6 is increasingly popular: it uses 2 disks instead of just 1 for parity, so if 1 disk goes you are not vulnerable during the time it takes to acquire a spare and rebuild the array.

RAID 1 or 10 requires a multiple of 2 disks with a minimum of 2.
RAID 5 requires at least 3 disks.
RAID 6 requires at least 4 disks.
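
Given those minimums, the capacity cost of each level is easy to see by computing the usable space. A minimal Python sketch, assuming all disks are the same size (the `usable_capacity` helper is purely illustrative, not any real tool's API):

```python
def usable_capacity(level, n_disks, disk_tb):
    """Rough usable capacity for common RAID levels (identical disks)."""
    if level == 0:
        return n_disks * disk_tb                 # no redundancy at all
    if level == 1:
        return disk_tb                           # everything is mirrored
    if level == 5:
        assert n_disks >= 3
        return (n_disks - 1) * disk_tb           # one disk's worth of parity
    if level == 6:
        assert n_disks >= 4
        return (n_disks - 2) * disk_tb           # two disks' worth of parity
    if level == 10:
        assert n_disks >= 2 and n_disks % 2 == 0
        return (n_disks // 2) * disk_tb          # half the disks are mirrors
    raise ValueError(f"unsupported level: {level}")

print(usable_capacity(5, 4, 2))   # 4 x 2TB in RAID 5 -> 6 TB usable
```

Note how RAID 5's "overhead" stays at one disk no matter how many drives you add, while RAID 1/10 always costs you half.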

Broadly speaking, the more disks you use in a RAID the faster it will be.

What do you need RAID for?  (Database, general central file storage, or some other?)

For RAID 0 or 1 you can do the job in software, but hardware is preferred.  For the other RAID levels you really need dedicated hardware, even if it is only a PCI card.
noxcho (Global Support Coordinator) commented:
The simplest and clearest way to think of these RAID levels, with their advantages and disadvantages:
RAID 1 is considered a fault-tolerant configuration because it uses simple hardware mirroring. If one HDD dies, you can boot from the other one.
By the way, if you want a RAID disk that is easy to configure and maintain, you might like to consider a Thecus box.

The most common scenarios are:

RAID 1 - mirror - mirrors the data between 2 disks; if one fails, the second (warm) disk becomes primary and carries on working.  Overhead: 50%; at least 2 disks required.

RAID 5 - striping with parity - stripes data across all disks along with parity information, so if one disk fails the system continues to work.  Overhead is a single disk out of your array; at least 3 disks required.
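
The "parity" part can be shown in miniature: the parity block is just the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. A small Python sketch (the byte values are arbitrary, chosen only for illustration):

```python
# Three data blocks, one per data disk in a tiny RAID 5 stripe.
data = [0b10110010, 0b01101100, 0b11100001]

# The parity block is the XOR of all data blocks.
parity = 0
for block in data:
    parity ^= block

# Simulate losing disk 2: XOR parity with the surviving blocks
# to reconstruct the missing one.
rebuilt = parity ^ data[0] ^ data[2]
assert rebuilt == data[1]
print("rebuilt block matches the lost one")
```

This is why RAID 5 only gives up one disk's worth of space: one XOR result covers any single failure, but a second simultaneous failure (RAID 6's territory) cannot be recovered this way.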

RAID 6 - as above with more parity - well explained by Martin above.

RAID 10 - Combines striping with mirroring.  This is basically 2 striped sets that are mirrored.

RAID 6 is becoming more popular as disks get cheaper, but a lot of people still use the following:

RAID 1 - system (operating system)
RAID 5 - Data (data spread across the RAID)
RAID 10 - Database (the best solution for SQL, and the more spindles the better).

A typical small company would have a single server with a RAID 1 (2 disks) and a RAID 5 (4 disks: 3 in the array plus 1 hot spare).

If they used databases, they would have a second server hopefully running a RAID 10.

I would add some layman terms for RAID...

Most RAID setups use the following terms and parameters:
- RAID controller: either hardware (a dedicated card in your server, usually on a PCIe or PCI-X bus) or software (the server OS handles the drives and presents a single "logical drive" to users, built from many drives using RAID features). The controller receives the IO operations from your server OS, decides what needs to happen on the drives under its control, issues those operations, and returns the results to the OS. It also stores the definition of how your drives are handled: which drives, in which RAID level, using which read/write mode, which stripe size, and which rebuild and scrub features.
- Stripe: a stripe has a defined size (depending on the RAID controller: 4/8/16/32/64/128/256/512KB or even 1/2/4MB) and is the smallest unit of data read or written per drive.
- A large cache backed by a battery: a RAM cache on the controller lets it re-order IO to improve performance; the battery lets a power outage occur without losing data integrity. (A hardware card without a battery-backed cache offers no real advantage over software RAID.)
- Controller mode "write through": the controller writes directly to the drives without letting the data wait in a memory cache. This mode is a write-performance disaster with parity RAID (RAID 5/6/50/60).
- Controller mode "write back": the controller writes to its cache (which must be battery-backed) and delays writing to the drives until the stripes to be written are full and require no additional IO to complete. (Example: a RAID 5 array of 3 drives with a 64KB stripe needs 128KB written at once to avoid re-reading the current stripes to recompute the parity stripe.)
- Controller mode "read ahead": the controller reads more from the drives than it was asked to, which optimizes mostly sequential IO patterns.
- Rebuild option "spare drive": most hardware RAID cards let you define a spare drive that is used immediately after a drive failure to rebuild the failed drive.
- Scrubbing: expect roughly 1 silent write failure (e.g. a 1 written correctly that reads back as a 0) per 20TB, and about 1 unrecoverable sector per 12TB read (for most "desktop" class SATA drives). The controller may scrub the volumes to detect and fix these problems; the only controller I know to be really safe on this front is ZFS (software RAID).
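
The full-stripe arithmetic behind the write-back example above is easy to check. A quick Python sketch (`full_stripe_write_bytes` is a hypothetical helper for illustration, not a controller API):

```python
def full_stripe_write_bytes(n_disks, stripe_kb, level=5):
    """Bytes the OS must write in one go so a parity RAID can compute
    parity directly, skipping the read-modify-write cycle."""
    parity_disks = 1 if level == 5 else 2   # RAID 5 uses 1 parity, RAID 6 uses 2
    data_disks = n_disks - parity_disks
    return data_disks * stripe_kb * 1024

# The example above: 3 drives, 64KB stripe -> 128KB full-stripe write.
print(full_stripe_write_bytes(3, 64) // 1024)   # 128
```

Anything smaller than a full stripe forces the controller to read existing data back in order to recompute parity, which is exactly where parity RAID loses its random-write performance.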

The RAID level debate:
- Parity RAID levels (5/6/50/60) should be reserved for backup/archive/write-once-read-many IO scenarios, because they suffer terribly under any random write activity.
- Parity RAID levels (5/6/50/60) require some tuning at design time (stripe size + aligned partition offsets).
- RAID 0 "striping" offers the best performance but no redundancy at all.
- RAID 1 "mirroring" offers good performance (some hardware controllers smartly route their READ operations to lower data access time: all inner-zone reads to drive #1, all outer-zone reads to drive #2) and can be improved a bit by tuning (stripe size + aligned partition offsets).
- RAID 10 (mirror the drives, then stripe the mirrors together) is the fastest choice for database scenarios and all "random write" scenarios.

The SAS/SATA debate:
- The best redundancy comes from storage systems with TWO controllers, each accessing every drive through its own cable, which means TWO interface ports per drive. I have never seen a SATA drive with two SATA ports, so the high-availability market has to use SAS drives (as of 2010).
- The SAS interface offers more features than SATA: yes, but so what?
- SAS 2.0 (6Gbps) and SATA III (6Gbps) are the 2010 standards.
- SAS drives usually support higher rpm than SATA, but most SSDs are SATA-based, and the IOPS world (where random IO is the only scenario that matters) is ruled by SSDs.
==> You are better off choosing a drive by its size, rpm, and UBE (Unrecoverable Bit Error) rate than by its interface.
Technically, you can do RAID 0/1/10/5/6 on a single disk drive. The technique is not uncommon in the Linux/UNIX world, and I do it myself sometimes (well, RAID 1) on a single disk.  You would do this on a single drive to use the redundant data to protect against data loss from unrecoverable blocks, or in the case of RAID 0, to increase throughput. But this is an advanced technique that most people don't know about (certainly nobody who responded earlier mentioned it).

I also take major exception to earlier posts that keep insisting one buys RAID 0 for best performance.   This is not always the case.   There are 2 metrics for performance: throughput and transactions (I/Os per second).   They are inversely related: optimize for throughput and your IOPS decreases, and vice versa. So RAID 0 is best for SEQUENTIAL THROUGHPUT intensive loads.

One cannot blindly say a certain RAID level is best in terms of performance without considering the nature of the data and tuning parameters appropriately.  One can, however, use the RAID levels/topology to describe redundancy levels and GENERALIZE performance characteristics.

RAID 0 on a single drive may badly hurt performance because you lose the sequentiality of IO.
Keep in mind that reading 1MB at 120MB/s (the current average throughput of an HDD) takes about 8.3ms, while a head seek on a 7200 rpm drive costs about 12ms on average. Reading 1MB from a "RAID 0 single HD" costs 2 x (12 + 4.2) ≈ 32ms, roughly 60% more than a normal HD's 1 x (12 + 8.3) ≈ 20ms.
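
Redoing that back-of-envelope arithmetic in Python (same assumed figures: 12ms average seek, 120MB/s sequential throughput):

```python
seek_ms = 12.0       # assumed avg access time, 7200 rpm drive
mb_per_s = 120.0     # assumed avg sequential throughput

transfer_1mb = 1000 / mb_per_s                    # ~8.3 ms to stream 1 MB
plain = seek_ms + transfer_1mb                    # one seek + one 1MB transfer
raid0_single = 2 * (seek_ms + transfer_1mb / 2)   # two seeks, 0.5MB each

print(round(plain, 1), "ms vs", round(raid0_single, 1), "ms")
print(f"{raid0_single / plain - 1:.0%} slower")   # ~59% slower
```

The extra seek dominates: splitting one disk into two RAID 0 "members" doubles the head movements for what was a single sequential read.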
The above is true for purely sequential access.  In the real world that will never happen unless you are using a custom file system and data-acquisition mode.  There is generally enough random I/O even in a "mostly sequential" environment to send the heads all over the place.   Dividing a disk into two partitions and configuring RAID 0 can be quite helpful, especially with a multi-path topology like Fibre Channel.  I should have clarified that the benefits are greatest where you have multiple I/O paths.  Since this question was purely RAID-based, and not specific to any particular storage technology or operating system, I mentioned the above.
To my understanding, multipath IO (MPIO) requires 2 ports per drive, each connected to a hardware RAID card (or Host Bus Adapter / HBA, for our layman's dictionary :-), but only 1 port at a time is used.
==> I don't see what this MPIO feature could offer in this situation...

Maybe you are thinking of another classic: "short stroking", which divides a drive so that only its outer zone is used, where throughput is highest; average access time also drops because the read/write head always stays in the same physical zone of the drive.
I think this thread is getting somewhat too technical considering that the OP asked for "layman's terms".

Single disk RAID doesn't do what most people want out of RAID, that being protection against total disk failure - which is what I would guess the OP is after as he's talking about RAID 1 and 5.

Argument about the obscurities of potential RAID performance bottlenecks is getting complicated for a beginner - and may not even be relevant to the OP's application.

OP - would you please let us know what you want to use RAID for so that we can advise accordingly?
There are obviously several experts here with significant amounts of hands-on RAID experience who should be able to help.
I agree, Martin, but again, I just wanted to set the record straight on the claim that RAID level such-and-such requires X number of disks.  Just one of my little pet peeves.  On more than one occasion this causes arguments like "you can't do that", or "performance will suck", or "why would you want to".   The point is that advanced techniques have their place, and are a better solution for some types of problems.

I've won bar bets that you can do RAID10 with 3 disks :)
lazik (Author) commented:
Great info everyone. I was looking for some layman's terms regarding RAID 1 and 5 and I got that, thank you all very much.
lazik (Author) commented:
Thanks again!
I can see what you are saying, dlethe, but for a beginner it's far easier to understand the simple multi-disk "pre-configured" types of RAID solutions.  That's what most off-the-shelf solutions are going to provide.  If necessary we can advise and explain if a particular situation requires something more advanced.  Most people looking at RAID for the first time just want protection against disk failure!