SMacaulay

RAID 6

Need some information on RAID 5, 6, or RAID in general. Had a problem this weekend with a RAID 5 implementation on an HP MSA 2012. I had a drive fail, and while the system was rebuilding with the global spare, another drive failed.

My question is, can any RAID set survive this scenario? I can see how different RAID sets can handle multiple failures, but can they recover from multiple failures at the same time or very close to each other?

I am getting ready to rebuild the failed RAID set and am looking for advice: should I just use RAID 5 again or implement RAID 6? RAID 10 is not an option at this time.
ASKER CERTIFIED SOLUTION
Toxacon
This solution is only available to members.
Having two drive failures is highly unusual. It can happen, but it's rare.

With RAID 5 on 3 drives, depending on how the drives failed, you may be able to copy the data off a failing drive onto another disk using ddrescue under Linux/Ubuntu, then put that copy into the array and continue the rebuild.

Losing sectors is not good, but you can survive it.
Losing a drive entirely, so that it stops spinning or is totally inaccessible, is bad.

If the drives are intermittent, you may be able to copy the data from each failing drive to an identical drive using ddrescue, put the copies back into the RAID, and let it repair itself. It just depends on the card and the extent of the damage.

Also, you could be having an issue with the memory or processor on the RAID 5 card. If the card is not working right, it will report a drive as failing when the drive itself is fine. Checksum-type errors on reads are calculated values, so with a flaky RAM chip or card you will get a reported failure even when the drive is operating normally.

With the MTBF being 50k hours on an average drive, the odds of two dying within a week of one another are slim to none.
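To put a rough number on that, here is a minimal sketch that assumes drives fail independently at a constant rate (an exponential model). The function name and the choice of two surviving drives are illustrative assumptions, not details from the original post; the 50k-hour MTBF is the figure quoted above.

```python
import math


def prob_second_failure(mtbf_hours, remaining_drives, window_hours):
    """Chance that at least one remaining drive fails inside the window.

    Assumes independent drives with a constant failure rate (exponential
    model); this is a simplification, not a vendor formula.
    """
    combined_rate = remaining_drives / mtbf_hours  # failures per hour, all drives
    return 1.0 - math.exp(-combined_rate * window_hours)


# Hypothetical case: 2 surviving drives, 50k-hour MTBF, one-week window.
p = prob_second_failure(mtbf_hours=50_000, remaining_drives=2, window_hours=7 * 24)
print(f"{p:.2%}")
```

Under these assumptions the result comes out below 1%. Drives from the same batch, or drives stressed by the rebuild itself, may not fail independently, so the real odds can be higher than this model suggests.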

If the drives support SMART, you could boot Ubuntu, connect each individual SAS drive to a USB interface or a SCSI card, and check the SMART status to see the remapped sector counts. You may find that there's no problem with the drives; if that is the case, replace the card.
If you pull three image files from the three drives on a separate unit, you may be able to recover 100% of your data by destriping the RAID 5 with RAID Reconstructor, the software from runtime.org.
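Destriping works because RAID 5 parity is a plain XOR of the data blocks in each stripe, so any one missing block can be recomputed from the others. The sketch below shows only that idea on tiny in-memory blocks; the block contents and the helper name `xor_blocks` are made up for illustration, and a real array also rotates parity across drives with a controller-specific stripe size, which recovery software has to work out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 3-drive RAID 5: two data blocks plus parity.
d0 = b"RAID"
d1 = b"five"
parity = xor_blocks([d0, d1])

# Pretend the drive holding d1 is gone: rebuild it from what is left.
rebuilt = xor_blocks([d0, parity])
print(rebuilt == d1)
```

The same XOR rebuilds whichever single block is missing, which is also why a second lost block in the same stripe cannot be recovered from RAID 5 parity alone.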

I recommend you pull image files of the entire disks, because if the drives are not reading well, it will take a very long time to get the data set back.
(It goes 3x as fast using three computers to pull one disk image each.)
Member_2_231077

The downside is that RAID 6 write performance is even worse than RAID 5's. You also have the option of RAID 50, which is a RAID 0 stripe of RAID 5s; it can survive two disk failures, but only if the failures are in different RAID 5 groups.
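The failure-tolerance rules described here can be written down as a small check. This is only a sketch: the function `survives`, the level names, and the drive-numbering scheme (drive index divided by `group_size` gives the RAID 5 group) are assumptions for illustration, not anything a controller exposes.

```python
from collections import Counter


def survives(level, failed, group_size=None):
    """Return True if the array still holds its data after these drives fail.

    failed is a set of zero-based drive indexes. For "raid50", drives are
    assumed to be numbered so that index // group_size is the RAID 5 group.
    """
    if level == "raid5":
        return len(failed) <= 1
    if level == "raid6":
        return len(failed) <= 2
    if level == "raid50":
        if not group_size:
            raise ValueError("raid50 needs group_size")
        per_group = Counter(drive // group_size for drive in failed)
        return all(count <= 1 for count in per_group.values())
    raise ValueError(f"unknown level: {level}")


print(survives("raid5", {0, 1}))                 # two failures in RAID 5
print(survives("raid6", {0, 1}))                 # two failures in RAID 6
print(survives("raid50", {0, 3}, group_size=3))  # one failure in each group
print(survives("raid50", {0, 1}, group_size=3))  # both failures in one group
```

Note that the check only counts failed drives. It says nothing about timing, so a second failure during a RAID 5 rebuild counts the same as two drives failing at once.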
Hi,

If you are using RAID 5, then:

1. If the second disk that failed is the hot spare, then most likely your data still survives.
2. If you do not have a hot spare and a second disk dies, your data has no way to survive.

Try rebuilding your RAID first, in case you get lucky.

RAID 6 tolerates one more failed drive than RAID 5; in that sense it is similar to RAID 5 with a hot spare. Another advantage of RAID 6 is its support for large disks. However, theoretically, rebuilding RAID 6 will take more time because it has to do a double parity check, and it requires a minimum of 4 disks (3 for RAID 5).
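The minimum disk counts above, together with the space each level gives up to parity (one disk's worth for RAID 5, two for RAID 6), can be compared with a short calculation. The function name and the six-disk, 1 TB example are hypothetical and only meant to show the trade-off.

```python
MIN_DISKS = {"raid5": 3, "raid6": 4}
PARITY_DISKS = {"raid5": 1, "raid6": 2}


def usable_tb(level, disks, disk_tb):
    """Usable capacity in TB for equal-sized disks, ignoring spares and overhead."""
    if disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    return (disks - PARITY_DISKS[level]) * disk_tb


# Hypothetical shelf of six 1 TB disks.
print(usable_tb("raid5", 6, 1.0))
print(usable_tb("raid6", 6, 1.0))
```

With the same disks, RAID 6 costs one extra disk of capacity compared with RAID 5 in exchange for tolerating a second failure.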

Better in terms of performance and security is RAID 10, but since you are not able to do it right now, do consider the other option of keeping a backup of your data (including the OS). Backup Exec or Acronis can give you peace of mind here. Of course, you will need a big external drive or network storage for that.

I'd like to share an interesting story about RAID 6 with you: http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805

Good luck.