RAID 1 in slot 0, Windows Server 2003.

I have an HP ProLiant ML370 G4.  One of the hard drives went bad (slot 0).

The C: drive is mirrored (RAID 1) in hard drive cage slots 0-1.
The D: drive (data drive) is configured as a RAID 5 array in hard drive cage slots 2-4.

The Hard Drive Cage has a total of 6 slots (0-5).

Please reference http://www.experts-exchange.com/Storage/Hard_Drives/Q_25468411.html

Today we had a power failure, and this one server will not come back up after the shutdown.  It tried to boot but could not find the OS.  So I turned the server off, unplugged the power, removed the bad hard drive, and inserted a new hard drive (same make/model/size/firmware, which we received just a few days ago).  Then I powered the server on.

Now the server gets to the point where it says "Windows Server 2003" (with the left-to-right scrolling bar underneath), and then a black screen comes up and that's it.  About 40 minutes later a blue screen comes up that states:

Stop C0000218 {Registry File Failure}
The Registry cannot load the hive (file)
\systemroot\System32\config\software
or its log or alternate.
It is corrupt, absent, or Not Writable.

Begin dump of physical memory
Dumping physical memory to disk ##

Then the server restarts itself.  It has restarted like this 3 times in a row (every 40 minutes).  I do not think the new hard drive is getting the data correctly.

There is no need to use this server today; but, I am sure they would like it up soon.

1.  I powered down the server (HP ProLiant ML370 G4).
2.  I unplugged the power cable.
3.  I pulled out the bad hard drive (in slot 0), which was now showing a solid amber light.
     a.  Before, it was a flashing amber light.
4.  Now when I turn on the server, the new slot 0 hard drive has a flashing green light.
     a.  The slot 1 hard drive shows a solid green activity light.

Questions:

1.  What can I do to try and fix this hardware problem?  

2.  Will I need to re-configure the array for slots 0-1 (C: Drive Windows Server 2003)?
Pkafkas (Network Engineer) asked:

Justin Ellenbecker (IT Director) commented:
The new drive is getting what is on the other drive, which includes a corrupt software registry hive.  That is the BSOD error you are getting: it fails while loading the registry hive.  You can try safe mode, but that probably will not work.  You will need to go into the Recovery Console and restore the registry hive if there is a backup on the server, or you may have to go to tape or other backup media.  This is not a hardware problem; it looks like the power outage corrupted your registry.  Windows Server does not have the System Restore feature that the client OS has, so unless you can get this file from somewhere else you are out of luck and need to restore from a backup.
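
As a rough way to triage which hives are damaged, here is a minimal sketch in Python. It assumes the system volume can be read from another machine or a recovery environment that has Python available, which is an assumption and not something described in this thread. It checks each hive file for existence, non-zero size, and the `regf` signature that registry hive files start with. The helper names and any path you pass in are hypothetical, and a passing check does not prove a hive is intact.

```python
from pathlib import Path

HIVE_SIGNATURE = b"regf"  # first four bytes of a registry hive file
HIVE_NAMES = ["software", "system", "sam", "security", "default"]


def check_hive(path):
    """Return a short status string for one hive file (hypothetical helper)."""
    p = Path(path)
    if not p.is_file():
        return "absent"
    if p.stat().st_size == 0:
        return "empty"
    with p.open("rb") as f:
        header = f.read(4)
    return "signature ok" if header == HIVE_SIGNATURE else "bad signature"


def check_config_dir(config_dir):
    """Check the usual hive files under a System32/config directory."""
    return {name: check_hive(Path(config_dir) / name) for name in HIVE_NAMES}
```

For example, `check_config_dir(r"E:\WINDOWS\system32\config")` would report one status per hive, where the `E:` path is only a placeholder for wherever the volume happens to be mounted.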
0
David (President) commented:
Strife is correct: you do have file system/registry corruption.  This is different from RAID corruption.  However, you should attend to the lowest level first: the RAID.
 - Read the RAID error logs if they exist.  You possibly had unrecoverable partial data loss if there were unreadable bad blocks on the surviving disk.  You probably did this to yourself if you never ran consistency checks.  See my new paper on disk/RAID reliability (it just made Editor's Choice a few minutes ago, so it is worth reading regardless):
http://www.experts-exchange.com/articles/Storage/Misc/Disk-drive-reliability-overview.html

- If you had unrecoverable data, then the damage goes beyond the registry: you lost chunks of data files, or more correctly, some data files now have binary zeros in them wherever the bad blocks were.  Even if you do a recovery, you still have data corruption.

- Make sure the rebuild completed, then do a consistency check now.  It is possible that it could even repair damage (I would need to know much more to tell you for sure, so I won't say that it will or it won't, just that it is possible).

- After the RAID 1 is healthy, THEN go forward with the suggestion above.  You don't want to build upon a shaky foundation.
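
To illustrate what a consistency check on a mirror does conceptually, here is a minimal Python sketch that compares two raw image files block by block and reports the block numbers that differ. It assumes both members of the RAID 1 set have been imaged to files on a separate machine. The function name, the 512-byte block size, and the image files are illustrative assumptions, and this is not how the controller itself is implemented. On real disks, controller metadata areas may legitimately differ between members.

```python
def find_mismatched_blocks(image_a, image_b, block_size=512):
    """Return indexes of blocks that differ between two disk images."""
    mismatches = []
    with open(image_a, "rb") as a, open(image_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both images are exhausted
            if block_a != block_b:
                # Also catches the case where one image is shorter.
                mismatches.append(index)
            index += 1
    return mismatches
```

An empty result means the two images are identical at this block size. It says nothing about whether the data they both hold is itself valid, which is why a mirror can be consistent and still carry a corrupt registry hive.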
0
Pkafkas (Network Engineer, Author) commented:
How long should the recovery process (RAID rebuild) take?
About 8 hours?

I began the rebuild at 9:20 am (Central time).  It just seems that every 40 minutes it restarts with the BSOD message.  How can I tell if I am making progress?

If I am not making any progress, I may need to reconfigure the RAID 1 on slots 0-1, correct?
0
Justin Ellenbecker (IT Director) commented:
dlethe, you make some very good points.  I shouldn't have assumed that the RAID was checked and rebuilt properly.  This is a tough situation, and when I have seen things like this, unfortunately there is almost always corruption that cannot be recovered.  Hopefully you can get the RAID checked out, and hopefully only the registry file is corrupt.
0
Justin Ellenbecker (IT Director) commented:
The recovery can take that long if the drive is large.  I have a system here that took 36 hours to rebuild the RAID the last time it failed.
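
A back-of-the-envelope estimate can help judge whether a rebuild time is plausible. The Python sketch below divides the drive capacity by an assumed sustained rebuild rate. The 72 GB figure is the drive size the asker reports further down this thread. The rates are purely illustrative assumptions, since the real rate depends on the controller, its rebuild priority setting, and the I/O load on the array.

```python
def estimate_rebuild_hours(capacity_gb, rate_mb_per_s):
    """Estimate hours to copy capacity_gb at rate_mb_per_s (decimal units)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = (capacity_gb * 1000) / rate_mb_per_s
    return seconds / 3600


# 72 GB comes from this thread; the rates below are assumptions.
for rate in (5, 20, 50):
    print(f"{rate} MB/s -> {estimate_rebuild_hours(72, rate):.1f} h")
```

Under these assumed rates a 72 GB mirror would take somewhere between about half an hour and four hours, so a rebuild of that size that shows no progress after many hours suggests it is being interrupted or is failing rather than simply running slowly.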
0
Justin Ellenbecker (IT Director) commented:
The drive that did not fail, drive 1, will not have a solid light; there should also be a rebuild indicator.  You will want to boot into the RAID BIOS and check the progress instead of rebooting the OS over and over again, which may cause more problems.  I believe the bug check will stop and restart the RAID rebuild every time the system goes back to POST.  Next time it boots, get into the RAID BIOS and you should be able to see a progress meter there.
0
Pkafkas (Network Engineer, Author) commented:
This server is restarting on its own, because of the physical memory dump, every 40 minutes.  Then I must press the 'F1' key to continue.
0
David (President) commented:
Well, there are actually several ways to recover data in this situation that don't involve a clean room.  Once the author started trying to reboot instead of letting the process finish, the degree of difficulty and talent required to recover increased considerably, approaching that of somebody who has written RAID and disk firmware for years, like me :)

You need to know what is happening.  Is there any way to look at what is going on verbosely?  Are the problems data inconsistencies, unrecoverable read errors on the source disk, or write errors and timeouts on the target disk?  Each requires different techniques, but all things considered, they are outside the scope of somebody who isn't familiar with the process; it just isn't practical to walk somebody through this.

A 36-hour rebuild is not unusual.  I've seen rebuilds take well over a week with SATA disks and appliances.

If this is really important stuff, then you can do it manually, but you need software I doubt you have, and you have to migrate the disks to a non-RAID controller.  So just wait and let it finish unless you want to pay $$$.
0
Pkafkas (Network Engineer, Author) commented:
These are 72 GB hard drives.  Not very large hard drives.

When I go into the array configuration utility, it states that the RAID 1 slot 0 drive needs recovery.

Perhaps I will wait 1 day and see where we are.  Every 40 minutes, that BSOD comes up.
0
David (President) commented:
You mean you are attempting recovery while it is booted?  NO.  Let it recover under the BIOS.  Do not attempt to boot; you are just doing more damage.
0
Pkafkas (Network Engineer, Author) commented:
It is booting by itself.  After 40 minutes the BSOD comes up stating...

"Stop C0000218 {Registry File Failure}
The Registry cannot load the hive (file)
\systemroot\System32\config\software
or its log or alternate.
It is corrupt, absent, or Not Writable.

Begin dump of physical memory
Dumping physical memory to disk ##"

Like clockwork (every 40 minutes).  Then I have to press the 'F1' key to continue the boot process.  Then 40 minutes later the same thing happens.  That is why I do not think it is rebuilding the slot 0 hard drive correctly.  I think that I will probably have to reconfigure the C: drive RAID 1 array.

What do you think?
0
David (President) commented:
Is there a way to disable autoboot? Just take this disk out of the bootable device list, if you must.  You are corrupting it more and more every cycle.  (Not doing RAID damage, but filesystem damage)
0
Pkafkas (Network Engineer, Author) commented:
Then it will not boot up at all.  It will say it cannot find the OS.
0
Pkafkas (Network Engineer, Author) commented:
What if I just delete the array, create it over again, and reinstall the OS?
0
Justin Ellenbecker (IT Director) commented:
You can reinstall the OS if you have a good backup.  You will still have to wait for the drives to sync, and you will still want to run diagnostics on the other drive as well.  There is no use installing a fresh OS onto a messed-up drive.  You will still need to be in the RAID BIOS for this, and I am not 100% sure, because I have never tried it, but it may not let you delete the array until it has finished the rebuild.  When the computer boots there should be a stage where it is initializing the drives, and it will say something like "press Ctrl-S to enter RAID BIOS".  This is where you need to be: then it will never get to the OS, it will not try to boot or read the registry, and it can rebuild in peace.
0
Pkafkas (Network Engineer, Author) commented:
Both hard drives on the RAID 1 array were bad.  What are the chances of that?

I used the HP SmartStart CD to delete the old array and configure a new RAID 1 array.  That is where I found out the hard drive in slot 1 was bad.  It was showing an '!'.
0
Pkafkas (Network Engineer, Author) commented:
The feedback was good.
0