glennx

Dell PowerEdge 2650 RAID 1 degraded and won't come back online

We have a Dell PowerEdge 2650 server with a failed RAID 1 drive. The drive has been removed and a new, initialized drive has been put in its place. When I open Server Administrator, the virtual disk is shown as 'Degraded' and it is not rebuilding. There is no disk activity at all. I have already tried to make the new disc a hot spare, but it still says Ready and won't rejoin the virtual disc. I don't see a way to force the rebuild; there is no Rebuild option like there should be. Does anyone have any ideas?
David

Is the capacity of the replacement disk >= the capacity of the one it replaced? I.e., same make/model/firmware?
glennx

ASKER

Yes, it is.
Avatar of glennx

ASKER

The only difference was that I took it out of an identical server, wiped the disc, and then used it to replace the bad disc in my degraded server. So the drive wasn't brand new, but it definitely worked.
That is the problem. It has metadata on it from the previous controller. You're lucky you didn't cause it to get confused.

Yank it out, put the disk on a NON-RAID controller, and blow away both the first 128 KB at block zero and the last 128 KB at the end of the disk, or just wipe it completely.
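
For anyone who wants to script the "first 128 KB and last 128 KB" wipe described above, here is a minimal Python sketch. It is only an illustration under some assumptions: the disk is attached to a plain non-RAID controller, the OS exposes it as a raw device you can open for writing (the path you pass in, e.g. something like /dev/sdX on Linux, is a placeholder), and you have triple-checked that it is the right disk. The function name wipe_ends is made up for this example. This is destructive, so try it on a scratch image file first.

```python
import os


def wipe_ends(path, span=128 * 1024):
    """Zero the first and last `span` bytes of `path`.

    `path` may be a regular file (for a dry run) or a raw disk device.
    Returns the total size in bytes that was detected.
    """
    with open(path, "r+b") as f:
        # Seeking to the end reports the size for block devices as well,
        # where os.path.getsize() would typically return 0.
        size = f.seek(0, os.SEEK_END)
        n = min(span, size)  # do not write past the end of a tiny target

        # Overwrite the start of the disk (block zero onward).
        f.seek(0)
        f.write(b"\x00" * n)

        # Overwrite the tail, where many controllers keep their metadata.
        f.seek(size - n)
        f.write(b"\x00" * n)

        # Push the zeros out of the OS cache before closing.
        f.flush()
        os.fsync(f.fileno())
    return size
```

Calling wipe_ends("scratch.img") on a test image and inspecting it with a hex viewer is a cheap way to confirm that only the two end regions were touched before pointing it at a real drive.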
Avatar of glennx

ASKER

I did that first. I put the replacement disk in a different server and reformatted it, then popped it in where the failed drive was in my primary server. Shouldn't that have done the trick?
Hi,

I think what dlethe wants to say is: put the disk back in its original server. Start the server and, in the RAID controller setup, remove the disk from the logical RAID so that it becomes a standalone disk that is not a member of any logical RAID. You may wish to format the disk after that.

Just make sure the disk is not a member of any RAID.
Avatar of glennx

ASKER

I will try to take the wiped drive out, place it back in the original server, and then try to take it out of its RAID. Then I'll see if it will rebuild itself onto the primary server.

I will let you know how that goes.  It's been formatted already so I shouldn't have to re-format it again.

Does metadata get removed when you boot the server to the PERC 3/Di controller and then use the disc utilities to format the disc, or is there another way? I did a low-level format.
Avatar of glennx

ASKER

No go, guys. That didn't work either.
You need to use a program that does a full zeroing out of the first cylinder so that no meta info is left.

Download the free UBCD, as it has disk wipe utilities to do this.


 UBCD Free with Disk tests and Vendor utilities
www.ultimatebootcd.com

I hope this helps !

Darik's Boot and Nuke 2.2.6 beta (rebuilt with isolinux V3.86 to improve compatibility) is on the UBCD, and that is what should be used in most cases.
Avatar of glennx

ASKER

So I was able to get Disk Management to see it. It's showing up as drive F now. In the Physical Disks section of Server Administrator, both discs are showing up as Online. This looks better; however, the virtual disc still shows as Degraded. My fear is that the new drive is still not part of the mirror.

Any ideas would be appreciated.
Avatar of glennx

ASKER

Sorry, I needed to refresh my machine. I will try nuking the disc and see what happens.

Thanks...
Avatar of glennx

ASKER

I'm still getting the same message, guys. The physical disc says Online, but the virtual disc says Degraded.
degraded-virtual-mirror.docx
Avatar of glennx

ASKER

If anyone can help, I would greatly appreciate it.
glennx ... on a PERC 3/Di, it should be sufficient to Initialize the new disk, then assign it as a Hot Spare.  To clarify another question ... there is no Rebuild option for a READY disk - you can only Rebuild a FAILED disk - assigning a drive as a Hot Spare will cause a READY disk to begin a Rebuild into the array.

You say that both drives are ONLINE now, but the VD still shows Degraded?  In OMSA, go to Storage, PERC, Information/Configuration (link at top of page), and select Rescan from the dropdown menu of Available Tasks for the controller.
Avatar of glennx

ASKER

Thank you very much.  I am almost a newbie here, so that was very helpful.
How about a screenshot of your Virtual Disks screen too?
Avatar of glennx

ASKER

I will try your suggestion and let you know in a few mins.

Thank you.
Avatar of glennx

ASKER

Please see attached...
degraded-virtual-mirror.docx
Not your Physical Disks screen, but your Virtual Disks screen.
Avatar of glennx

ASKER

Scroll down to the second page in the doc.
Ah ... didn't think to look for a second page.  Let us know what happens with a Rescan.
Avatar of glennx

ASKER

Will do. Also, I did assign it as a hot spare originally, but it never rebuilt and I never saw the option to rebuild. Strange.

I'll let you know after a rescan.
Not familiar with the PERC 3, but is it possible that while it says degraded, it is actually rebuilding?

Avatar of glennx

ASKER

It's only a 33 GB drive and it's been like that for over two days, so I don't think it is rebuilding, but I'm not sure how to tell.
It will say something like reconstructing or rebuilding with a percent-complete.  You could post a controller log ... OMSA, Storage, PERC, Information/Configuration, Export Log, that should shed some light on when it actually completed ... and possibly even what else is wrong.
Avatar of glennx

ASKER

It doesn't say that, but I am trying to reinstall OpenManage now because I tried updating it to see if I could get any help.
Well, if it is already rebuilt, it won't say that now.  It would have said that for the hour or two it was actually rebuilding.

Before trying to reinstall OpenManage, did you run the Rescan?
Avatar of glennx

ASKER

Yes, but the drive still said Ready under physical and Degraded under virtual. What do you think I could try next?
Avatar of glennx

ASKER

Sorry, not Ready; I mean Online under physical.
Sorry, just want to be clear here ... does the drive say READY or ONLINE (you and the screenshot said ONLINE - there is a big difference)?  

After selecting Rescan, it will ask you twice (I think - at least once) to confirm you want to do a Rescan.  The Rescan should take at least 10 seconds (up to 60).  If you don't think it ran like that, run it again.

Export a controller log and post it here.
Avatar of glennx

ASKER

Let me try again, thanks!
Avatar of glennx

ASKER

So now another problem arises. My server has now created another degraded RAID with the second drive.

Please help. See screenshots!
screenshots.docx
Avatar of glennx

ASKER

Also the log file, if that helps.
afa-0909.log
Avatar of glennx

ASKER

Under Physical Disks they say "ONLINE".

Under Virtual Disks there are now 2 "Degraded RAID 1".
I don't know when the first screenshot was created, but the second screenshot shows you did not rebuild the replacement drive into your RAID 1, but created a new/separate RAID 1 with that drive (or if it was part of a RAID 1 in its previous machine, its configuration was imported, showing that RAID 1 on this machine).

Delete the last RAID 1 Virtual Disk, Initialize it, then assign it as a Hot Spare/Failover.
Sorry I didn't get back earlier; I was out.

You can NOT wipe metadata out from behind a PERC controller. As I wrote, it MUST be a NON-RAID controller. I emphasized that for a reason. The PERC controllers will not let you low-level initialize the hidden area. If you put that disk back in the original, or even another PERC controller, you make things worse and put all your data at risk. Had you used a non-RAID controller, you would have had no problems getting it to work.

Frankly at this point with all the things that have happened, I would strongly consider taking a binary image before doing anything else. You must NOT do it with the PERC, you need a NON-RAID controller.

You don't have a quorum anymore, so you could very well be writing info from the replacement disk onto the new one, and your existing metadata is screwed up. The safe thing to do is use another machine with a scratch disk and buy a copy of runtime.org's RAID Reconstructor (this is easy, but not necessarily cheap). Take a binary image of the data and save it to a local drive. Then reformat the 2 data disks, stick them in the PERC, and build a fresh new RAID. Then install the RAID controller + 2 drives into the machine you licensed RAID Reconstructor to, and image the backup of your data onto the logical RAID 1.

Above is a sure thing.  (Provided you didn't muck up your data already). Other techniques may work, but this will work.
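
To make the "binary image" idea above concrete, here is a minimal Python sketch of a raw, sector-for-sector copy with a checksum. It is not what RAID Reconstructor or any other imaging tool does internally; it only assumes the source disk is on a non-RAID controller and readable as a raw device, and that the destination has enough free space. The function name image_disk and both paths are placeholders for this example, and there is no bad-sector handling, so a read error will simply stop the copy.

```python
import hashlib


def image_disk(src, dst, chunk=1024 * 1024):
    """Copy `src` to `dst` byte for byte and return (bytes_copied, sha256_hex).

    `src` is opened read-only, so the source disk is never written to.
    """
    sha = hashlib.sha256()
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:  # an empty read means end of device/file
                break
            fout.write(buf)
            sha.update(buf)
            total += len(buf)
    return total, sha.hexdigest()
```

Recording the returned checksum next to the image lets you verify later that the saved copy has not changed before you restore from it.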

 
Avatar of glennx

ASKER

OK, I did this and we are back to square one again. Please let me know the next steps. See screenshots.
degraded-virtual-mirror2.docx
Avatar of glennx

ASKER

I ran DBAN on the disc before I installed it the second time yesterday. Would that have stripped the hidden area on the disc, or would it still matter?
You can follow dlethe's plan - which may be your only option later - but what would be the normal practice, if you were just coming into this scenario, would be to insert the drive, Initialize it (from Physical Drives dropdown), then Assign it as a Hot Spare.  If that doesn't work, you have something else going on preventing the rebuild and restoring from backup on a fresh array may be your next step.
Avatar of glennx

ASKER

OK guys, thanks for all your help. We ordered a brand new drive as a replacement instead of using an old drive.

I am going to install it Monday morning, initialize it, assign it as a hot spare, and see if it rebuilds.

Thanks for all the help, I will keep you all posted!
Look, you have NVRAM in the controllers and metadata in a reserved area on your disk drives.   The metadata contains things from known bad blocks, to even start/end of logical data, as well as RAID parameters and device names. Plus you could be dealing with different record layouts because the metadata could have different revisions.  

So, what can I say, I'm being conservative. The only absolute is that it is messed up. I just spent last weekend with a client who had an LSI-based controller (different model) and fibre channel drives that basically went through this swap-and-cross-your-fingers process due to busted metadata. They lost one volume because the metadata from the other drives 'conspired' to override a good config with a stale one, and it overwrote a disk with live data with mirrored data from another controller.

Make a proper backup with a NON-RAID controller just in case.  I say 4:1 against having any problems, but if this was my data, I would invest the time to make an image copy.  You could have a list of blocks that are queued up for writing that must not be written, or it will damage live data, or damaged metadata, or a bunch of other things I don't need to get into.

Bottom line, at this point take an image clone as a parachute, then slap in the virgin replacement and cross your fingers, but you could still get some data corruption that you have no way of anticipating without access to some of the things I have. Worse, the next time you have a drive failure, the controller could make an incorrect decision based on a wrong internal drive mapping, and then it comes back to haunt you.

Best practice, if you value your data, is to blow away the LUN, build and initialize a new RAID 1, then restore.

Avatar of glennx

ASKER

Guys,

New drive came in today. I inserted it into the server, initialized the disc, and then assigned it as a global hot spare. Same result: the drive just reads Ready and did not rebuild and assign itself to the degraded virtual array.

At this point, do you all recommend that I just take an image of the OS, blow away the array, configure a new one, and then drop the image back down?

Also, we don't have an imaging program such as Ghost, etc. Do you know of any freeware products I can download?

Thanks again...
ASKER CERTIFIED SOLUTION
PowerEdgeTech

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
You can try Clonezilla: http://clonezilla.org/
Avatar of glennx

ASKER

OK, I will give it a shot and let you know.

Thanks again...
SOLUTION
Avatar of glennx

ASKER

Great, thanks a lot!
SOLUTION
Avatar of glennx

ASKER

I have a second RAID 5 array that is storing all data. This server is just our primary file server, so I will have no problems rebuilding that if need be.
SOLUTION
Avatar of glennx

ASKER

Thanks for all your help, guys. Deleting the array and restoring from the REDO backup was the key.
Glad you're back up and running :)