Solved

RAID 6

Posted on 2011-09-07
Last Modified: 2012-06-27
Need some information on RAID 5, 6 or RAID in general.  Had a problem this weekend with a RAID 5 implementation on an HP MSA 2012.  I had a drive fail, and while the system was rebuilding onto the global spare, another drive failed.

My question is, can any RAID set survive this scenario?  I can see where different RAID sets can handle multiple failures, but can they recover from multiple failures at the same time or very close to each other?

I am getting ready to rebuild the failed RAID set and am looking for advice: should I just use RAID 5 again or implement RAID 6?  RAID 10 is not an option at this time.
Question by:SMacaulay
5 Comments
 
LVL 8

Accepted Solution

by:
Toxacon earned 250 total points
ID: 36497514
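RAID 5 stores one disk's worth of parity (spread across all the member disks), RAID 6 stores two. That means a RAID 6 set can tolerate two disks failing at the same time, or a second disk failing while the first is still rebuilding.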
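To see why single parity stops at one failure, here is a minimal Python sketch of one stripe on a hypothetical 3-disk RAID 5. The block contents and the `xor_blocks` helper are made up for illustration; a real controller works on much larger stripes and rotates the parity position, but the XOR arithmetic is the same idea.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: two data blocks plus the parity block computed from them.
d0 = b"RAID"
d1 = b"five"
parity = xor_blocks([d0, d1])

# Single failure: d1 is lost, so rebuild it from the two survivors.
rebuilt_d1 = xor_blocks([d0, parity])
print(rebuilt_d1 == d1)  # True

# Double failure: only the parity block survives. Different data
# (here the two blocks swapped) produces the very same parity, so
# parity alone cannot tell you what d0 and d1 were.
print(xor_blocks([d1, d0]) == parity)  # True
```

RAID 6 adds a second, independently calculated parity block per stripe, which is what gives it the extra equation needed to solve for two missing blocks.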
 
LVL 12

Expert Comment

by:coredatarecovery
ID: 36497549
Having 2 drives fail that close together can happen, but it is highly unusual.

With a 3-drive RAID 5, depending on how the drives failed, you may be able to copy the data off the failing drive onto another disk using ddrescue under Linux/Ubuntu, then put the copy into the array and continue the rebuild.

Losing sectors is not good, but you can survive it.
Losing the drive entirely, so that it stops spinning or is totally inaccessible, is bad.

If the drives are intermittent, you may be able to clone each failing drive to an identical drive using ddrescue, put the clones back into the RAID and let it repair itself. It just depends on the card and the extent of the damage.

Also, you could be having an issue with the memory or processor on the RAID controller. If it's not working right, it will report a drive as failing when the drive itself is fine. Any checksum-type errors on reads are calculated values, so with a flaky RAM chip or card you will get a reported failure even when the drive is operating normally.

With the MTBF being 50k hours on an average drive, the odds of 2 dying within a week of one another are slim to none.
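As a rough check on that figure, here is a back-of-the-envelope Python sketch. It assumes the drives fail independently at a constant rate of 1/MTBF, which is a simplification: drives from the same batch, or drives being hammered by a rebuild, may not behave independently. The function name and the choice of 2 surviving drives (a 3-drive set) are illustrative assumptions.

```python
import math

def prob_second_failure(mtbf_hours, surviving_drives, window_hours):
    """Chance that at least one surviving drive fails within the window,
    assuming independent drives with a constant failure rate of 1/MTBF."""
    rate = surviving_drives / mtbf_hours
    return 1.0 - math.exp(-rate * window_hours)

# 50k hour MTBF, 2 drives left after the first failure, one week window.
p = prob_second_failure(mtbf_hours=50_000, surviving_drives=2, window_hours=7 * 24)
print(f"{p:.2%}")  # 0.67%
```

Under these assumptions it works out to well under 1%, so low but not zero, and the number goes up with more drives in the set.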

If the drives support SMART, you could boot Ubuntu, connect each SAS drive individually to a USB interface or a SCSI card and check the SMART status to see the remapped sector counts. You may find that there's no problem with the drives; if that is the case, replace the card.
 
LVL 12

Expert Comment

by:coredatarecovery
ID: 36497569
If you pull 3 image files from the 3 drives on a separate unit, you may be able to recover 100% of your data by destriping the RAID 5 with RAID Reconstructor, the software from runtime.org.

I recommend you pull image files of the entire disks, because if the drives are not reading well, working from them directly will take a very long time to get the data set back.
(It goes 3x as fast using 3 computers to pull 1 disk image each.)
 
LVL 55

Expert Comment

by:andyalder
ID: 36497595
The downside is that RAID 6 write performance is even worse than RAID 5's. You also have the option of RAID 50, which is a RAID 0 stripe of RAID 5s; it can suffer two disk failures, but only if the failures are in different RAID 5 groups.
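To put a number on the "different RAID 5 groups" condition, this small Python sketch counts how many two-disk failure combinations each layout survives. The 8-disk set split into two RAID 5 groups of 4 is a hypothetical layout chosen for illustration, and the helper names are made up.

```python
from itertools import combinations

def raid50_survives(failed, group_size):
    """RAID 50 survives if no RAID 5 group has lost more than one disk."""
    groups = [disk // group_size for disk in failed]
    return len(groups) == len(set(groups))

def raid6_survives(failed):
    """RAID 6 survives any combination of up to two failed disks."""
    return len(failed) <= 2

disks = range(8)  # hypothetical: 8 disks, RAID 50 built as two groups of 4
pairs = list(combinations(disks, 2))

raid50_ok = sum(raid50_survives(p, group_size=4) for p in pairs)
raid6_ok = sum(raid6_survives(p) for p in pairs)

print(f"RAID 50: {raid50_ok}/{len(pairs)} two-disk failures survivable")  # 16/28
print(f"RAID 6:  {raid6_ok}/{len(pairs)} two-disk failures survivable")   # 28/28
```

So with this layout RAID 50 rides out a bit more than half of the possible double failures, while RAID 6 rides out all of them.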
 
LVL 13

Expert Comment

by:khairil
ID: 36497605
Hi,

If you are using RAID 5 then:

1. If the second disk that failed was the hot spare, then most likely your data still survives.
2. If you do not have a hot spare and a second disk goes kaput, your data has no way to survive.

Try to rebuild your RAID first, in case you get lucky.

RAID 6 lets you lose one more drive than RAID 5. It costs the same number of disks as RAID 5 with a hot spare, but the extra disk already holds parity, so a second failure during a rebuild is survivable. Another advantage of RAID 6 is that it copes better with large disks. However, theoretically, rebuilding RAID 6 will take more time because it has to calculate double parity, and it requires a minimum of 4 disks (3 for RAID 5).
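The disk-count trade-off can be sketched in a few lines of Python. The 6 bays and 300 GB disk size below are made-up numbers for illustration, not the actual MSA 2012 configuration, and the helper name is hypothetical.

```python
MIN_DISKS = {"RAID 5": 3, "RAID 6": 4}
PARITY_DISKS = {"RAID 5": 1, "RAID 6": 2}

def usable_capacity(level, disk_count, disk_size_gb):
    """Usable space: total disks minus the parity overhead for the level."""
    if disk_count < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    return (disk_count - PARITY_DISKS[level]) * disk_size_gb

# Hypothetical shelf: 6 bays of 300 GB disks.
print(usable_capacity("RAID 5", 6, 300))  # 1500 (all 6 in the set, no spare)
print(usable_capacity("RAID 5", 5, 300))  # 1200 (5 in the set + 1 hot spare)
print(usable_capacity("RAID 6", 6, 300))  # 1200 (all 6 in the set)
```

RAID 6 and RAID 5 plus a hot spare give the same usable space from the same number of disks; the difference is only in what happens when a second disk fails.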

RAID 10 is better in terms of performance and security, but since you are not able to use it right now, do consider other options such as keeping a backup of your data (including the OS). Backup Exec or Acronis can give you peace of mind here. Of course, you will need a big external drive or network storage for that.

I'd like to share an interesting article about RAID 6 with you: http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805

Good luck.