RAID 5 Recovery Options

I was called in to look at a server which was unable to boot. The server initially had two drives with the amber failure light on when they came in. While troubleshooting, they removed the drives and plugged them back in, at which point both drive lights went green. However, they still could not access the data.

When I got there all lights were green, but I could not access any of the virtual machines. I brought the server down and ran HP SmartStart, and it shows only 1 drive in an error condition. The server is running ESXi 4, which is installed on a separate mirrored drive. The server boots but it no longer sees the RAID 5 datastore. I clicked on Add Storage in the VMware console and it sees the RAID 5 array but says it is empty.

At this point I backed out without making any changes and suggested they look at data recovery options. They called one company, which was recommended by a local computer supplier, and were given a quote of probably $10,000-$15,000, which is out of their price range.

This is a small company and no one was checking the backups. They just changed the tapes every day and assumed the backup was running every night, so there are no good backups to restore from.

Are there any other options?
Asked by: qvfps

andreas, System Admin, commented:
Uh-oh, this doesn't look good. In general, a RAID 5 with 2 failed disks means the data on the array is gone.

There are some cases where there is a small hope, so I will ask for more details about the RAID setup.

How many disks did the RAID have in normal operation? Was there a hot spare?
Are there any logs available from the RAID controller? Sometimes logs are visible from the BIOS/RAID BIOS. What are the messages there?

What is the CURRENT state of the RAID? What does the RAID controller say about the CURRENT state of the disks/array?

If there was a hot spare that took over for the first failed drive before the 2nd drive failed, there is a small chance of recovery.

If the disks are no longer grouped as an array and are managed as JBOD, you may still have a chance to rebuild the array in some circumstances.

First case: only one disk is really defective, and the 2nd disk that was amber just dropped off the bus but is still physically OK. In that case you need to MIRROR ALL the drives from the RAID 5 to other HDDs. It will all be very time-consuming and slow, as you have to make as many images as the RAID has HDDs. While creating the images you also find out which of the drives are OK and which give read errors. If there are really 2 disks that can't be mirrored anymore, you are out of luck; the data is gone. If only one drive, or none, fails to copy, you may have luck restoring. If all drives did mirror, you can continue with the recovery process.

There is software from Runtime called RAID Recovery for Windows.

https://www.runtime.org/raid-recovery-windows.htm

which you may use to recover the data from the images you created before. There is other software from that company for Linux-formatted RAIDs too.

https://www.runtime.org/nas-recovery.htm

or the generic RAID Reconstructor.

https://www.runtime.org/raid.htm

They also offer a RAID-Probe service where they manually investigate parts of the image files the software transmits to them to see if the images contain enough data to restore the RAID.

If there was NO hot spare and the ARRAY is now in state RAID 5 - OK (not degraded), there is almost NO chance of recovery anymore, as during the rebuild the controller may have overwritten all the data on the disks.

But you still might try the options from above, imaging the drives and the Runtime software, as a chance is a chance even if it's tiny.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
And this is why RAID 5 is no longer recommended.

http://www.reddit.com/r/sysadmin/comments/ydi6i/dell_raid_5_is_no_longer_recommended_for_any/

If the data is important, I would recommend speaking to Kroll Ontrack, data recovery specialists.

http://www.krollontrack.co.uk/

You may also want to use the following to scan the VMFS datastore, to see if it can detect VMs

http://www.diskinternals.com/vmfs-recovery/

http://vmfsrecover.com/

http://www.ufsexplorer.com/download_pro.php

We use VMFS Recover and UFS Explorer.

qvfps, Author, commented:
It was an array of 8 146 GB drives with no hot spare.
andreas, System Admin, commented:
Then see my post above: try to make images of all 8 disks on a Windows machine and run the RAID recovery software. Do this only if professional recovery is out of the question, as everything you try yourself can reduce the chances of a professional lab recovering the data.
qvfps, Author, commented:
How would the cloning work? Shut down the server and remove each drive one by one, clone it to an identical drive, and put the original back in the server? Once all disks are done and back in their original locations, run the recovery tools against them? If they are then unable to recover their data, can they send the cloned drives out for recovery?

Could they install Windows on the two unaffected mirrored drives and use that to try to recover the array, or use a Linux boot disk?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
It looks like the following: initially two drives failed in the RAID 5. At that point the RAID array has failed and is broken. RAID 5 can only recover if a single drive fails.

The RAID array has become unusable. It's gone.

A mistake was then made: the two drives were reinserted, and the RAID array cannot rebuild.

The big question is whether the two disks failed because of hardware failure or surface issues. If so, the data is gone.

"How would the cloning work? Shut down the server and remove each drive one by one, clone it to an identical drive, and put the original back in the server? Once all disks are done and back in their original locations, run the recovery tools against them? If they are then unable to recover their data, can they send the cloned drives out for recovery?"

Not quite. You create an image of each of the disks, then use the software to combine them all as RAID 5, and then mount and scan this image.
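
To illustrate what "combine them all as RAID 5" means, here is a rough Python sketch that stitches per-disk image files back into one logical volume. Everything specific in it is an assumption for illustration only: the left-asymmetric parity rotation, the stripe size, the disk order, and data starting at offset 0 of each image. The real layout used by the controller is not known from this thread, and working it out is exactly what the recovery products do. The names `parity_disk` and `reassemble_raid5` are hypothetical.

```python
def parity_disk(row, n_disks):
    """Assumed left-asymmetric rotation: parity starts on the last disk, moves left each row."""
    return (n_disks - 1) - (row % n_disks)


def reassemble_raid5(image_paths, output_path, stripe_size):
    """Write the data blocks of every stripe row, skipping parity, to output_path."""
    n_disks = len(image_paths)
    images = [open(path, "rb") for path in image_paths]
    try:
        with open(output_path, "wb") as out:
            row = 0
            while True:
                blocks = [img.read(stripe_size) for img in images]
                if any(len(block) < stripe_size for block in blocks):
                    break  # stop at the first incomplete stripe row
                skip = parity_disk(row, n_disks)
                for disk, block in enumerate(blocks):
                    if disk != skip:  # parity carries no user data
                        out.write(block)
                row += 1
    finally:
        for img in images:
            img.close()
```

The image list must be given in the array's original disk order. If the assumed order, stripe size or rotation is wrong, the output is scrambled rather than damaged, since only the copies are touched, which is why working from images is the safe way to experiment.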

"Could they install Windows on the two unaffected mirrored drives and use that to try to recover the array, or use a Linux boot disk?"

It's possible you could use Windows on these disks, and the software I've linked to, to scan the RAID array and try to find the VMFS partition.

I would do this, worth a try.
qvfps, Author, commented:
So I would need another server with 8 open bays to clone and combine the drives?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
You don't need 8 open bays, image one drive at a time.
qvfps, Author, commented:
Then I can combine them 1 drive at a time as well? Does it need to be on a server with a similar RAID card, or doesn't that matter?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Most recovery software can work with, and combine, the images from many disks in a RAID.

You will need to select a product and read its recovery procedure.

A standard SCSI/SAS controller can work; you only need a platform that can read the disk raw.

This method does take many hours to complete, and the success rate is low. This is why DR companies charge so much: it's many man-hours.

Links to software have been provided.
andreas, System Admin, commented:
You save the images of the disks as regular files on a large HDD, let's say a 2 TB drive. There you have enough space to install the Windows OS for the rescue, to place the images, and to hold the dug-out data if successful.
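
A quick back-of-the-envelope check of the space needed, as a small Python sketch. The figures come from this thread (8 drives of 146 GB, a 2 TB target drive); treating 2 TB as 2000 GB and ignoring the space taken by the OS are simplifying assumptions, and the function name `space_plan` is just illustrative.

```python
def space_plan(disk_count, disk_size_gb, target_gb):
    """Rough space budget for imaging a RAID 5 set onto a single target drive."""
    images_gb = disk_count * disk_size_gb               # one full image per member disk
    max_recovered_gb = (disk_count - 1) * disk_size_gb  # RAID 5 usable capacity
    return {
        "images_gb": images_gb,
        "max_recovered_gb": max_recovered_gb,
        "left_after_images_gb": target_gb - images_gb,
    }


plan = space_plan(disk_count=8, disk_size_gb=146, target_gb=2000)
print(plan)
# {'images_gb': 1168, 'max_recovered_gb': 1022, 'left_after_images_gb': 832}
```

So the eight images fit comfortably, but if the datastore held more than roughly 800 GB of data, the recovered files may need a second drive.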

You do not need 8 bays; one bay is enough. The RAID controller on that machine does not matter; it does not even need to have a RAID controller. It just needs to be able to read the disk, i.e. it needs the same interface as the RAID disks have (SAS, SATA, SCSI).

For your recovery and imaging you do not need identical drives.

Sometimes disks fall out of a RAID because of temporary failures. In such a case even a double disk failure on a RAID 5 does not mean total loss of all data.
E.g. I once had 2 RAID disks fail at once due to overheating in summer, after an AC breakdown in a tiny room with a lot of hardware.

When we found the system, the OS had hung with the partition unreadable (only I/O errors on all operations, and 2 drives with amber LEDs). After cooldown, both disks were recognized by the controller again and the RAID was assembled correctly again.
So in your case it really depends on whether
1st: 7 out of 8 disks are readable
2nd: on pull and replug the controller didn't completely re-create the volume, overwriting all data with a new RAID volume.

That is why I was asking what the current state of the RAID is (all LEDs green again or not) and what the RAID BIOS says. This gives hints as to whether the disks may have been overwritten, and whether the disks are seriously broken, in which case they would not be detected or would show other problems.
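
Why "7 out of 8 disks readable" is the threshold comes down to how RAID 5 parity works: each stripe row stores the XOR of its data blocks, so any single missing block equals the XOR of the others. A toy Python sketch with made-up 4-byte blocks (the block contents and the names `xor_blocks` and `rebuild_missing` are purely illustrative):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe, missing_index):
    """Rebuild one lost block of a stripe row from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


# One stripe row across 8 disks: 7 data blocks plus 1 parity block.
data_blocks = [bytes([disk + 1] * 4) for disk in range(7)]
stripe = data_blocks + [xor_blocks(data_blocks)]

# One disk lost: the missing block comes back exactly.
one_lost = rebuild_missing(stripe, 2)
print(one_lost == stripe[2])                        # True

# Two disks lost: XOR of the six survivors only yields the two missing
# blocks mixed together, not either block on its own.
six_left = [b for i, b in enumerate(stripe) if i not in (2, 5)]
mixed = xor_blocks(six_left)
print(mixed == xor_blocks([stripe[2], stripe[5]]))  # True
print(mixed in (stripe[2], stripe[5]))              # False
```

With two blocks missing from the same row there is one equation and two unknowns, which is why a genuine double failure cannot be undone by the parity alone.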

qvfps, Author, commented:
Thanks for clarifying that. All lights are green on the drives.
andreas, System Admin, commented:
What does the RAID BIOS say about the status of the RAID?
qvfps, Author, commented:
If I were to take a standard PC and purchase 2 SAS-to-SATA adapters and a 2 terabyte drive, would I then be able to clone the SAS disks to files on the 2 terabyte drive and run the recovery against them, or do I need to have a true SAS interface to connect the drives to? Does the 2 terabyte drive need to be SAS as well, or could I just go with SATA?

If the above will work, what software do you recommend for cloning the drives?
andreas, System Admin, commented:
One adapter would be enough; you can mirror disk by disk. It does not matter what kind of technology the internal disk uses. It can be anything: SATA, SAS, SCSI. You just store normal files on it.

For mirroring you can use HDD Raw Copy Tool: http://hddguru.com/software/HDD-Raw-Copy-Tool/
qvfps, Author, commented:
Thank you for your time and patience. This has helped me tremendously in understanding my options for trying to do a recovery myself.
Marshal Hubs, Email Consultant, commented:
Try recovering the failed RAID array using Linux. First connect the drives to a Linux-based PC, open a terminal, and type in the following commands:

ubuntu@ubuntu:~$ sudo -i
root@ubuntu:~# apt-get install mdadm
(Select "No configuration" when prompted.)
root@ubuntu:~# apt-get install lvm2
root@ubuntu:~# mdadm -Asf && vgchange -ay


If you're unable to recover data this way, you can try Stellar Phoenix Windows Data Recovery-Technician. This software recovers data from RAID 0, 5 and 6 based arrays by virtually rebuilding them. Download the demo version of the software from the official website: https://www.stellarinfo.com/windows-raid-recovery.php