Solved

RAID 5 - unique situation and question

Posted on 2013-12-07
Last Modified: 2016-11-23
Hey guys.. time is critical on this one.. any help appreciated.

Without too much explanation, here's what I have:

Dell Perc Raid Controller.  2 drives from a RAID 5 array that are perfectly intact....

EXCEPT...

The second drive is a sector-by-sector image of the original, made with no errors, so it's an exact copy. (Drive 3 is totally toast.)

In theory, I should be able to use those 2 drives - and I have already been able to do so, but only in a limited fashion - the software I have to rebuild the array only lets me save files, or make an image file, but will not let me clone the array directly onto another drive for booting purposes.  The problem is that image creation takes about 14 hours.  I just finished a 14 hour stretch and after that was done, the destination disk that contained the image went belly up on me (it was a 3 TB drive).  We're talking about a large amount of data, so imaging and reimaging is just too time consuming and this server needs to be up by Monday morning!

I wanted to put the good drives back into the server, but the problem is that the RAID controller sees the perfectly good copy of the failed drive as a drive that doesn't belong to the array.  It's a different model and serial number drive than the original, and my guess is that the raid config is sensitive to this.  

IS THERE ANY WAY... to trick the RAID controller into using this newly copied RAID member and allow me to boot this server and perform a backup?  Would it be possible to use a hex editor on the drive to fix a serial number problem?

Please don't suggest I image it again.  I don't have that many hours left to do that.  Suggestions are welcome but please hurry!

Thanks!!!
Question by:TimFarren
11 Comments
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 39703302
So first question (and I suspect I know the answer but I'll ask it anyway), why not restore from backups to a new RAID set?

Second, how many drives were in the original RAID 5 config?  3?  If so, I don't understand why you're going through all these hoops.  You should be running fine in a degraded state.  Replace the drive with ANY drive that's the same size or larger than the original and tell the controller to rebuild if it doesn't do it automatically.

Third, this data is CLEARLY important... understand that you're asking for tips and ideas concerning a LARGE amount of important data from the internet... we're good... but we're NOT THERE, and I find it doubtful that ANY true expert is going to make promises that their recommendations have ZERO risk.  Anything you try that we might suggest COULD very well cause a catastrophic data loss.  If this data is as important as you say, there should be backups. If there aren't, then clearly once you have a method that works you should recover it using that method, because any other method could lose EVERYTHING.  Once you recover everything, if you want to spend the time finding a better way, go right ahead, just in case this happens again.

So, third having been stated, I've had great success using the PAID version of RAID reconstructor to rebuild the data off failed RAID 5 sets onto a disk that was perfectly readable in NON-RAID form.

Lastly, to reiterate, you have a known successful method - use it... you may not get sleep tonight, but most experienced admins have had those nights (myself included).  Accept this and tomorrow (or the day after) take the lessons learned and plan for such failures in a way that will let you get sleep at night.

(I'm really not trying to be offensive - I've been where you are - not EXACTLY the same circumstances, but similar ones with a software RAID and Microsoft... and I know how VITAL data can be.  So I'm VERY VERY conservative about advising anyone to do anything that could permanently lose the data.  And I'm not there, I don't know the RAID controller firmware or the EXACT RAID config, and frankly, I've not used many CURRENT Dell controllers for anything other than setup (they've been really good since old style SCSI died).)
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 39703310
Ok, I missed some details in my reading of that question in the email... It's a 3 drive RAID...OK.  BUT, why are you using software to recover the RAID... why are you not just letting the controller run the array in a degraded state?  And why are you using a duplicated drive?  (And HOW did you duplicate it?  What software did it?  dd?  Acronis?  imagex?  Something else?  Some of those MAY NOT WORK!)
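Whichever imaging tool was used, one way to check that a clone really is sector-identical is to compare it against the source (or a saved image of the source) chunk by chunk. The sketch below is only an illustration: the function name `first_mismatch` is made up, and it assumes both sides can be opened as ordinary readable binary streams. The in-memory streams at the bottom stand in for real disks or image files.

```python
import io
from typing import BinaryIO, Optional


def first_mismatch(a: BinaryIO, b: BinaryIO,
                   chunk_size: int = 1024 * 1024) -> Optional[int]:
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            # Narrow the difference down to the exact byte.
            for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                if x != y:
                    return offset + i
            # Same content so far, but one stream ended earlier.
            return offset + min(len(chunk_a), len(chunk_b))
        if not chunk_a:
            return None  # both streams ended with no difference
        offset += len(chunk_a)


# Stand-ins for the original member and its clone.
original = io.BytesIO(b"RAID member contents")
clone = io.BytesIO(b"RAID member contents")
print(first_mismatch(original, clone))
```

A result of `None` means the two streams matched byte for byte. A numeric result is the offset where they first diverge, which also covers the case where the clone is shorter or longer than the source.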
 
LVL 2

Author Comment

by:TimFarren
ID: 39703348
Let me clarify a little:

1.  3 Disk array.
2.  Sequence of events:  Out of drives 0, 1, and 2 - disk 1 failed.  We replaced the drive with a new one and began a rebuild.  During the rebuild, disk #2 died before the rebuild could complete.  This left us with an inconsistent array.  
3.  NO BACKUP.. I know.. I know.. moving on...
4.  I used a high end disk imager to perform an exact sector copy from one drive to another.  I was able to recover all sectors to a new identical disk.
5.  I was able to rebuild the array using software and correct parameters, and I see good data - but I have no bootable server.  
6.  I attempted imaging the array onto an image file for later reimaging back to a fresh array, but that process takes about 14 hours.
7.  I did attempt to put drives 0 and 2 back into the server (leaving out the partially rebuilt disk 1) but the PERC controller refuses to acknowledge the duplicated disk as the original (probably due to the serial number on the drive).
8.  I attempted to rebuild the raid array on the controller side, but I can't get the controller to cooperate. Is there a way to trick the controller into accepting the replacement?  If I can get the controller to do the heavy lifting of raid, I can start a backup onto a usb drive and then do a restore.
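For anyone following along, the reason drives 0 and 2 are enough on their own is that RAID 5 stores one parity chunk per stripe, computed as the XOR of the data chunks, so any single missing chunk can be recomputed from the others. A minimal sketch of that idea follows. The chunk contents and the `xor_chunks` helper are invented for illustration and say nothing about the PERC's actual on-disk format.

```python
def xor_chunks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length chunks byte by byte."""
    if len(a) != len(b):
        raise ValueError("chunks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe of a hypothetical 3-disk RAID 5: two data chunks plus parity.
data_a = b"ACCOUNTS"
data_b = b"PAYROLL!"
parity = xor_chunks(data_a, data_b)

# Pretend the disk holding data_b is gone: the two survivors still yield it.
recovered_b = xor_chunks(data_a, parity)
print(recovered_b == data_b)  # True
```

The same arithmetic explains why a stale or half-rebuilt member has to be left out: XOR against chunks that were never rebuilt produces garbage rather than the missing data.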
 
LVL 55

Expert Comment

by:andyalder
ID: 39703392
Assuming you have a forensic copy of all the disks that you can fall back to if needed, I would check whether you can import the cloned disk using the import foreign option. In the BIOS I think that's found by highlighting the controller and pressing F2, but they do hide it quite well. The timestamp on the metadata for this disk will be wrong, so it should appear as foreign. This will not work if the OS has been run between the disk being removed and cloned and it being put back, since the data on it will be stale.

A 2nd drive failing during a rebuild isn't at all unique by the way although it is lucky you managed to make a copy of it presumably sector-by-sector. As leew says RAID reconstructor can recover the data because it will ignore the metadata, but presumably you have used something similar which took the 14 hours.
 
LVL 2

Author Comment

by:TimFarren
ID: 39703461
I'm trying to avoid this (I've already tried it and have wasted over 28 hours):

Rebuild w/software RAID--> Image FILE --> Image file back to new RAID Array.

I'd rather do this:
Drives attached to server, boot with Usb or CD using software capable of dumping the rebuilt array onto a new array - image rebuilt array directly onto new array...

I have been unsuccessful in finding any software which will dump the recovered array to a disk rather than to an image - it seems like such a short distance between either method programmatically.  I'm willing to pay for software that will do it - but can't find it.
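To show how short that distance is in principle, here is a rough sketch of streaming a rebuilt RAID 5 volume straight to a target instead of to an image file. Everything specific in it is an assumption: the function names are made up, the parity rotation is left-symmetric (the real controller's layout may differ), and the chunk size, member order, and any space the controller reserves for metadata would have to match whatever the recovery software detected. The demo uses in-memory streams. In real use, `target` would be an opened raw device, and writing to it overwrites whatever is there.

```python
import io
from typing import BinaryIO, List, Optional, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    n = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return n.to_bytes(len(a), "big")


def parity_disk(row: int, disks: int) -> int:
    """ASSUMED left-symmetric rotation: parity starts on the last disk."""
    return disks - 1 - (row % disks)


def rebuild_to_target(members: Sequence[Optional[BinaryIO]], target: BinaryIO,
                      chunk_size: int, rows: int) -> int:
    """Stream the logical volume to target; one member may be None (missing)."""
    if sum(m is None for m in members) > 1:
        raise ValueError("RAID 5 tolerates only one missing member")
    disks = len(members)
    written = 0
    for row in range(rows):
        chunks: List[Optional[bytes]] = []
        for m in members:
            if m is None:
                chunks.append(None)
            else:
                m.seek(row * chunk_size)
                chunks.append(m.read(chunk_size))
        if None in chunks:
            # XOR of all surviving chunks in the row equals the missing chunk.
            rebuilt = bytes(chunk_size)
            for c in chunks:
                if c is not None:
                    rebuilt = xor_bytes(rebuilt, c)
            chunks[chunks.index(None)] = rebuilt
        p = parity_disk(row, disks)
        # Data chunks follow the parity chunk and wrap around (left-symmetric).
        for k in range(1, disks):
            target.write(chunks[(p + k) % disks])
            written += chunk_size
    return written


def split_into_members(volume: bytes, disks: int, chunk_size: int) -> List[bytes]:
    """Demo helper: lay a volume out across members with the same assumed layout."""
    row_bytes = chunk_size * (disks - 1)
    out = [bytearray() for _ in range(disks)]
    for row in range(len(volume) // row_bytes):
        data = volume[row * row_bytes:(row + 1) * row_bytes]
        p = parity_disk(row, disks)
        parity = bytes(chunk_size)
        for k in range(1, disks):
            chunk = data[(k - 1) * chunk_size:k * chunk_size]
            out[(p + k) % disks] += chunk
            parity = xor_bytes(parity, chunk)
        out[p] += parity
    return [bytes(m) for m in out]


volume = bytes(range(256)) * 3  # 768 bytes of sample data
imgs = split_into_members(volume, disks=3, chunk_size=64)
members = [io.BytesIO(imgs[0]), None, io.BytesIO(imgs[2])]  # disk 1 left out
target = io.BytesIO()
rebuild_to_target(members, target, chunk_size=64, rows=len(imgs[0]) // 64)
print(target.getvalue() == volume)  # True
```

This is a sketch of the technique rather than a tool to point at a production array. It does no error handling for unreadable sectors, and pure Python would be slow over terabytes. The point is only that writing to a device and writing to an image file differ by what `target` is.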

The suggestion about pulling the drive in as foreign - no, the controller doesn't see it as foreign.  My guess is that the drive has contradictory data on it (if the RAID info on the drive contains drive serial numbers, then it doesn't even have its own serial number included in the config - I wondered if a hex editor would allow me to alter the RAID config stored on the drives and fix this issue?)

Anyway, thanks for the continued help.  I'm running out of time, but I do appreciate everyone's input.
 
LVL 55

Expert Comment

by:andyalder
ID: 39703602
What does the controller see the drive as then? Can you post a screenshot (or photo) from BIOS with the drive visible.
 
LVL 2

Author Comment

by:TimFarren
ID: 39703665
It just sees it as a new disk.  It doesn't associate it with any array.
 
LVL 46

Expert Comment

by:noxcho
ID: 39705570
That is correct: the drive does not have a record at its very first track saying that it is part of the RAID, and unfortunately there is no way to put this record there.
Normally there are two approaches to storing such info. The first, and rarer, one is that the RAID controller has its own memory and stores the information about the RAID group members in it.
The second, which is far more widespread, is that the info is stored on the HDD itself. Thus if the controller fails you can move the drives to another controller of the same model and you are good to go.
In your situation the only way out would be a complete restore from the image.
I assume you were using RAID Reconstructor - right?
 
LVL 55

Expert Comment

by:andyalder
ID: 39705584
A forensic copy will copy the metadata as well as the data though, and a PERC 5 or 6 stores the config in the metadata and on the controller.
 
LVL 2

Accepted Solution

by:TimFarren
ID: 39733826
In the end I got a working server like this:

1.  Instead of using data recovery software to dump to an image (like using RAID Reconstructor, for example), I used R-Studio to build the RAID array and dump all files to an external hard disk, including security and attributes.
2.  I installed the OS from CD on a brand new blank RAID array on the server, constructed with new hard disks.
3.  After completing the install, I booted with a Windows CD and moved all folders created by that OS install into a folder named BAK.  
4.  I used that same Windows boot CD to ROBOCOPY all files from the external HD onto the new RAID array, including security and file attributes.
5.  I booted with the Windows 2008 CD and performed a repair to fix the boot loader inconsistencies.
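For reference, a copy like the one in step 4 could look roughly like the command assembled below. The switches are an assumption about what "including security and file attributes" would need, not the exact command used here, and the drive letters are placeholders. `/E` copies subdirectories including empty ones, `/COPYALL` copies data, attributes, timestamps, ACLs, owner, and auditing info, `/XJ` skips junction points, and `/R:1 /W:1` keeps retries short. Python is used only to build and explain the command; on a boot CD the resulting line would simply be typed at the prompt.

```python
from typing import List, Optional


def build_robocopy_command(source: str, dest: str,
                           log_file: Optional[str] = None) -> List[str]:
    """Assemble a Robocopy command that preserves security and attributes."""
    cmd = ["robocopy", source, dest,
           "/E",        # include subdirectories, even empty ones
           "/COPYALL",  # data, attributes, timestamps, ACLs, owner, auditing
           "/XJ",       # do not follow junction points
           "/R:1",      # retry a failed file once
           "/W:1"]      # wait one second between retries
    if log_file:
        cmd.append(f"/LOG:{log_file}")
    return cmd


def robocopy_succeeded(exit_code: int) -> bool:
    """Robocopy exit codes of 8 or higher mean at least one copy failure."""
    return exit_code < 8


# Placeholder paths: E: as the external disk, C: as the new array.
print(" ".join(build_robocopy_command("E:\\recovered", "C:\\")))
```

Note that Robocopy does not use the usual "zero means success" convention, so a wrapper script should check the exit code against 8 rather than against 0.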

Bingo.. working server.

Thanks for your input folks.
 
LVL 2

Author Closing Comment

by:TimFarren
ID: 39739805
My solution got the job done.