Solved

My raid controller failed

Posted on 2008-06-19
Last Modified: 2010-04-21
Hi,
We experienced a power outage yesterday, and unfortunately I was not onsite to shut the server down properly. When I turned the server back on, this is the error I got:
Perc 4e/Di Standard FW
1 logical drive found on the host
0 physical drive found on the host
1 logical drive failed
1 logical drive handled by bios
0 physical drive handled by bios
Configuration of NVRAM and Drives mismatch
user configuration.......

then I press A to run the configuration utility
when I view the configuration
I see in RAID ch 1
0- FAIL A00-00
1-FAIL A00-01
2-ONLINE A00-02
 under logical drives configured I see:
LD         RAID      SIZE                 #STRIPES       STRPSZ       DRIVE-STATE
0              5        139760MB           3                     64KB           OFFLINE

I know I can select each drive and bring it online (I have seen someone do that before), but would I lose any of the configuration?
What is the best way to get this server back online and Windows up and running?
My other options in the menu are Rebuild and Reconstruct. I am really not good with arrays, so I don't want to try anything and lose everything.
Please, I need help!

thanks
David
Question by:taverny
46 Comments
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21827522
You SHOULD be safe putting the drives online.  I would.  Then I would make sure you go buy a good UPS for the devices.

If prompted, I would read the configuration from the drives.
0
 

Author Comment

by:taverny
ID: 21827528
Does it matter which drive I put online first?
0
 
LVL 24

Expert Comment

by:purplepomegranite
ID: 21827550
I'd be concerned that two of the drives are being reported as FAIL.  However, your only choice really is to try and bring them back online to see if the array comes back.  Shouldn't matter which one first - one should bring the array up (as it is RAID 5), but once one is up bring the other up too.

Then, as leew says, go and buy a UPS!
0
 

Author Comment

by:taverny
ID: 21827571
OK, I thought I knew how to bring the drives online, but I guess not. I haven't saved any configuration or modified anything. When I go under Management > Objects > Physical Drive, I see FAIL on all 3 physical drives,

but when I go to Management > View/Add Configuration > View Disk Configuration, I see disks 0 and 1 as FAIL and disk 2 as ONLINE.
How do I put those drives online?
Sorry for my ignorance.
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21827588
I can't give step-by-step instructions on that as I've not done it in years.  But there should be an option within the RAID controller's BIOS that allows you to change their status to online.

Here - quoting jamietoner (slightly edited) from this ee question: http://www.experts-exchange.com/Hardware/Servers/Q_23446953.html

From the Management menu select Objects, then Physical Drive; this should show all 3 drives. Assuming the drive IDs are 0, 1, 2, highlight drive 0 and press Enter, then choose Force Online. Do the same for drive 2.
0
 

Author Comment

by:taverny
ID: 21827596
OK, I found where to put a drive online. When I select Force Online, this is the message that I get:
!!!WARNING!!! Making a FAILED drive ONLINE will result in changing the logical drive state. This physical drive will immediately start participating in data read/writes and this may lead to corrupted data
are you sure .......

Should I still continue?
Thank you so much for your help.
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21827682
Assuming you are doing this in the RAID controller BIOS, there should be little danger.  Things SHOULD work well assuming that the drives didn't truly fail and just went offline (as I've seen happen often enough with dell RAID controllers).  

HOWEVER, while *I* would do this, I also CANNOT with 100% certainty say that you won't lose data.  You SHOULDN'T, but you could.  Your only alternative is to spend $100 on RAID Reconstructor, install the drives in another system and rebuild the data to another drive.  (Assuming you also don't have backups of the server....).
0
 

Author Comment

by:taverny
ID: 21827727
Hi Lee,
I just put the 3 drives online and rebooted the server. It got past the RAID configuration. Now I get the message:
Windows could not start because the following file is missing or corrupt:
\WINDOWS\SYSTEM32\CONFIG\SYSTEM
You can attempt to repair this file by starting Windows Setup using the original CD-ROM.
Select 'r' at the first screen to start repair.

Does this mean that forcing the hard drives online failed, or should I just put the CD in?
0
 
LVL 24

Expert Comment

by:purplepomegranite
ID: 21827742
It means that the array is essentially back, but the power failure did cause some corruption on the drive.  The file it is referring to is in fact a registry hive, so you may want to recover that from a backup.
0
 

Author Comment

by:taverny
ID: 21827753
I only have backups of the files and database that I use; it's a small business with 6 workstations.
I guess I will have to try with the CD. What do you think? Do you have a better idea?
It is a Windows 2003 server doing file sharing, that's it, no Exchange.
0
 
LVL 24

Expert Comment

by:purplepomegranite
ID: 21827769
You could boot to command prompt (recovery console) and restore the default SYSTEM hive, as that is what you'd do from CD anyway.  Having said that, there may then be more corruption elsewhere, so it would be worth running repair from the CD - especially as you are only using the server for filesharing (shouldn't be any complications).
0
 
LVL 95

Accepted Solution

by:
Lee W, MVP earned 460 total points
ID: 21827774
That means your registry is corrupt.  The hard drives are PROBABLY fine, but it's possible when the power outage occurred and your server shut down improperly, that your registry was corrupted.  You'll now need to TRY to repair this - I REALLY hope you have a good backup somewhere... because that's what you'll likely need now.  And specifically, a SYSTEM STATE backup.  The rest of the data on the system is LIKELY ok... but once you get the system running, I would STRONGLY recommend running a CHKDSK /F on ALL drive letters.

As for recovering the registry, I would suggest you try to make a boot CD using the Ultimate Boot CD for Windows or Bart PE (ultimately, this is the EASIEST way to recover the system - especially if this happens again to this or any other system you work on).  Your alternative is to use the Recovery Console which is NOT NEARLY as user friendly.

What you basically need to do now is rename the existing SYSTEM file in c:\Windows\system32\config to SYSTEM.BAD.  Then COPY the SYSTEM file from C:\WINDOWS\REPAIR to C:\WINDOWS\SYSTEM32\CONFIG.  Once copied, reboot and cross your fingers.

The SYSTEM file contains all the hardware information about your system and as a result, when you put back the "REPAIR" copy, the system may well revert to a hardware config much like it was when it was first installed.  If you haven't done any changes to the system hardware, this may be ok (but that also includes 3rd party drivers, like Antivirus drivers and CD Burning drivers).  Once you can get the system booted into Windows, do a SYSTEM STATE Restore and that SHOULD restore a much more recent copy of the hardware portion of the registry.  (By hardware portion I mean that any time you view/set/change a registry setting under the HKEY_LOCAL_MACHINE\SYSTEM (HKLM\SYSTEM) key, you are technically editing a setting in that SYSTEM file.)
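
For readers who want to see the rename-and-copy step spelled out, here is a minimal sketch in Python. It is only an illustration, assuming you can reach the Windows folder from some other environment (a boot CD, or the disk attached to another machine); the function name `restore_system_hive` is made up for this example and is not part of any tool mentioned in this thread.

```python
import shutil
from pathlib import Path


def restore_system_hive(windows_dir):
    """Rename config/SYSTEM to SYSTEM.BAD, then copy repair/SYSTEM in its place.

    windows_dir is the path to the offline Windows folder (an assumption of
    this sketch, e.g. C:\\WINDOWS as seen from a boot CD).
    """
    windows_dir = Path(windows_dir)
    config_hive = windows_dir / "system32" / "config" / "SYSTEM"
    repair_hive = windows_dir / "repair" / "SYSTEM"
    backup_hive = config_hive.with_name("SYSTEM.BAD")

    if not repair_hive.is_file():
        raise FileNotFoundError(f"no repair copy at {repair_hive}")
    if backup_hive.exists():
        # Never overwrite an earlier backup of the original hive.
        raise FileExistsError(f"{backup_hive} already exists")

    # Keep the damaged hive so the change can be undone later.
    if config_hive.exists():
        config_hive.rename(backup_hive)
    shutil.copy2(repair_hive, config_hive)
    return backup_hive
```

Keeping SYSTEM.BAD rather than deleting it means the step can be reversed by copying it back if the repair copy does not boot.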
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21827781
I would not run a repair install from the CD right now... Registry corruption due to a power failure is not unlikely.  Get the system back up, run the CHKDSK commands recommended, then do a system state restore.
0
 

Author Comment

by:taverny
ID: 21827918
Well, it doesn't look good:

I renamed the file and then copied the one from the repair folder, then rebooted. The server did a check disk and then got to the Windows logo; after a while it rebooted again. I never see the login screen. I went back into the Recovery Console and tried to run a check disk from there, but now it doesn't accept my admin password.
What should I do?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828058
I'd recommend you do what I suggested and make a BartPE/Ultimate Boot CD for Windows boot disk.  That will let you boot the system and run chkdsk on it.  You can also try the Windows Automated Installation Kit (WAIK) which includes Windows PE (but I haven't used Windows PE yet so I can't help specifically with that).
0
 

Author Comment

by:taverny
ID: 21828072
ok I am gonna download the file right now. Thank you again. I am really desperate to have this system up and running again.
when I boot with the disk would it ask me for a password, because my administrator password doesn't work anymore in the recovery mode?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828111
No, using a BartPE/Ultimate Boot CD for Windows boot disk, you don't need any passwords.  You may have a bunch of things to do at this point to get the system running, but this CD will boot the system and allow you to run a CHKDSK on the drives (I did just remember - the boot CD has to have the drivers for your disk controller - it PROBABLY does, but if it doesn't, don't panic...)
0
 

Author Comment

by:taverny
ID: 21828157
OK, if this whole thing works I seriously have to give you more than just points.
I am gonna bring this server home, since I don't have a burner here.
I also realized I have another server that is acting as a web server, and I just checked: it is also set up as a domain controller. Would that help?
David
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828192
Depends... was this a domain controller?  Also, are there any backups for it?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828198
I'll say this, if this WAS a domain controller, then it's a good thing the other is one as well, ESPECIALLY if you have no good backups.  That will help you get things running a LOT faster if, in the WORST CASE SCENARIO, the server needs to be rebuilt.
0
 

Author Comment

by:taverny
ID: 21828211
OK, I am going home and bringing the 2 servers with me.
I'll keep you posted when I plug everything in.
Thanks again...
0
 

Author Comment

by:taverny
ID: 21828540
I finally got home and I am currently downloading the Ultimate Boot CD for Windows .
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828561
Wow... where is your home and where is the client?
0
 

Author Comment

by:taverny
ID: 21828565
the client is about 35 min away but with traffic it took almost 1 hour
where are you located? I am in Chicago
0
 

Author Comment

by:taverny
ID: 21828578
OK, I downloaded the file UBCD4win.exe. Do I have to install it on my machine in order to burn it to a CD?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828598
You'll need a Windows CD (XP or Server 2003) to make the UBCD - it's pretty easy though.  Run the setup and follow the instructions.

New York... wow... traffic at 10:30 at night for you... horrid... I thought only New York was that bad...  :-)
0
 

Author Comment

by:taverny
ID: 21828619
No, it's really ridiculous here; there's traffic all day long.
I got the tutorial on creating the disk and I am copying the files right now.
Do you think it will work? I really need to have this server running for tomorrow. It's a server for a hair salon and they can't see the appointments. They have custom software that is awful to install, but I do have the backup of their "database" (a bunch of files in one folder).
0
 

Author Comment

by:taverny
ID: 21828720
OK, I just booted the server with the disk and tried the option to launch "The Ultimate Boot CD for Windows", and I get an NTLDR corrupt error.
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828732
Ok... I have to wonder then if it's something corrupt with the UBCD (my recent attempts at creating it resulted in the same error but I've recommended it recently here and no one else has had (or at least posted) issues).  I would recommend using the basic BartPE disk then - that's worked flawlessly for me lately.
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828748
Also, how exactly did you try to make the UBCD4WIN?  I just read over the FAQ and he recommends copying the source of the XP/2003 CD to the hard disk... (He doesn't mention this particular error, but he suggests copying the CD to the hard drive has fixed several odd issues).

-Lee
0
 

Author Comment

by:taverny
ID: 21828755
Lee,

I did copy the Win2003 CD to the local drive : c:\win2003 and followed instruction about the hidden files.
0
 

Author Comment

by:taverny
ID: 21828762
ok , it is building the cd again.
0
 

Author Comment

by:taverny
ID: 21828791
ok I booted with the CD and I have a go button on the bottom , what exactly do I do now?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828798
Ok, here's my recommendations because, and I hate to do this to you, but I have to get going.

1.  When you get the system booted using the Ultimate Boot CD for Windows or Bart PE (UBCD4WIN uses BartPE - UBCD4WIN just has a LOT more tools you can potentially use), open a command prompt and run CHKDSK C: /F (confirm that C: is the system's hard drive).  Once done, do as I posted earlier and copy the C:\WINDOWS\REPAIR\SYSTEM file into the CONFIG folder.  Then reboot and see what happens.

2.  If it comes back up, restore a System State backup (if you have one).  If not, there's a GOOD (but not 100%) chance the system will still boot into Windows but you may have to reinstall drivers and various software - let the goal be to get the event log to NOT have any red x's on a reboot.

3.  If the system does NOT come back up, you may well have to reinstall.  I would suggest, if the C: drive is large enough, you follow my directions for a fresh install with NO FORMAT (preserving all files on the drive).  See http://www.lwcomputing.com/tips/static/freshnoformat.asp

4.  If you have to do a fresh install, check with the other server - see if this server was a DC (check the Domain Controllers OU in Active Directory - was this server listed?).  If so, MAKE A SYSTEM STATE BACKUP of the other server, THEN ONLY IF you are intending to rebuild this system, follow the directions here - http://www.petri.co.il/delete_failed_dcs_from_ad.htm - to remove this failed server from the domain (note: you may have to seize roles - see this link for details on that: http://www.petri.co.il/seizing_fsmo_roles.htm).

I'm sure I'm leaving something out... there are many good experts on here... some in far away places like Australia... I'd suggest if you intend to work through the night and need further assistance, look up Jay_Jay70  - http://www.experts-exchange.com/M_3052984.html - he can probably pick up where I left off... and be awake to do it...
0
 

Author Comment

by:taverny
ID: 21828800
OK, I found the menu System > Storage > Check Disk, then I typed the letter C: and selected fix errors as well.
But it ran in less than a second and said done.
What should I do?
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828805
Note - There's a substitute "explorer" utility, I believe A43 - that will allow you to easily, through the GUI, make changes.  Otherwise, you can also open a command prompt and if you're familiar with command line commands, you can do everything there.
0
 

Author Comment

by:taverny
ID: 21828816
ok Lee,

Thank you very much for your help. I really appreciate everything you did. I am gonna try to follow your instructions and hopefully tomorrow this will all be a bad dream.

Thank you again.
I will keep you posted on the progress.
David
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21828832
When running chkdsk you want to do it from a command prompt.  Start a command prompt (Run "cmd") and then type CHKDSK C: /F and let it run
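
If there are several drive letters to check, the same command can be built and run in a loop. The following is a rough Python sketch, assuming a Windows environment where Python and `chkdsk` are both available (which will not be the case on every boot CD); the helper names `chkdsk_command` and `run_chkdsk` are invented for this example. By default it only prints the commands so they can be reviewed first.

```python
import subprocess


def chkdsk_command(drive_letter):
    """Build the argument list for 'chkdsk X: /F' from a drive letter."""
    letter = drive_letter.strip().rstrip(":").upper()
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    return ["chkdsk", f"{letter}:", "/F"]


def run_chkdsk(drive_letters, dry_run=True):
    """Print (dry_run=True) or execute CHKDSK /F for each drive letter."""
    commands = [chkdsk_command(d) for d in drive_letters]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=False: look at the return code yourself instead of raising.
            subprocess.run(command, check=False)
    return commands


if __name__ == "__main__":
    run_chkdsk(["C"])  # dry run: only prints the command
```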
0
 

Author Comment

by:taverny
ID: 21828840
ok I will do that , thank you so much. have a good night.
0
 

Author Comment

by:taverny
ID: 21828890
lee,
You probably went to bed, but the server just restarted. I tried copying SYSTEM.BAD back to SYSTEM and rebooted the server, and it worked!
0
 
LVL 95

Expert Comment

by:Lee W, MVP
ID: 21829001
Actually, I was leaving a client's office... home now... but have to sleep... have to move the car by 11:30am... I hate alt. side parking... anyway, Not entirely sure what steps you took, but as long as it's working, GREAT!... now make a system state backup NOW.  
0
 

Author Comment

by:taverny
ID: 21829002
Lee,
I just did a system state backup and copied all the files that I needed in case it fails again.
Now I am gonna reboot the server again and do a check disk.
0
 

Author Comment

by:taverny
ID: 21829009
thanks lee, I guess we are in the same boat. I have to move my car by 9:00
and I have to drive back tonight to my client
0
 
LVL 6

Assisted Solution

by:akirhol
akirhol earned 40 total points
ID: 21830485
For future reference, you should never force all drives in a RAID array online at once in any situation besides a RAID 0. No matter how safe you think it might be, there's always a chance that one drive failed before the other, and thus its data would be stale. Unless you know for sure that it wasn't, one of those drives could have been offline for months prior to the power outage, with the server running fine on a degraded RAID 5 all that time.
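
To make the stale-data risk concrete, here is a small Python sketch of one stripe on a three-drive RAID 5, where the parity block is the XOR of the two data blocks. The block contents and variable names are invented for illustration, and a real controller works on much larger blocks and rotates parity across drives; this only shows why a drive that dropped out early no longer agrees with the rest of the array.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks; RAID 5 parity is the XOR of the data blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Hypothetical contents of one stripe.
block_a = b"09:00 Smith"      # data block on drive 0
block_b_old = b"10:00 Jones"  # data block on drive 1 when it dropped offline
block_b_new = b"10:00 Patel"  # later write made while the array ran degraded

# Drive 1 was already offline, so the new data only reached the array
# through the parity block on drive 2.
drive0 = block_a
drive1 = block_b_old                        # stale: never saw the update
drive2 = xor_blocks(block_a, block_b_new)   # parity reflects the update

# Leaving the stale drive out: its block is rebuilt from the other two.
rebuilt_b = xor_blocks(drive0, drive2)

# Forcing the stale drive online: its block is read as-is, and the stripe
# no longer agrees with its own parity.
stale_read = drive1
stripe_consistent = xor_blocks(drive0, drive1) == drive2

print(rebuilt_b)          # b'10:00 Patel'
print(stale_read)         # b'10:00 Jones'
print(stripe_consistent)  # False
```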

The proper procedure here would have been to check for proper size reporting and see how many errors show on the drive in the RAID BIOS (pressing F2 on a drive will show you this information). If one drive has more errors than another, force the drive with fewer errors online first and attempt booting into Windows. If that doesn't work, force it offline and force the other drive online and attempt boot again. If both are the same as far as what they display in the F2 information, pick and choose one at random to use first. If either of them lets you boot into Windows, you then rebuild the other drive.
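
The ordering rule above can be written down as a short Python sketch. The `FailedDrive` record and its fields are hypothetical stand-ins for what you would read off the F2 screen by hand; nothing here talks to the controller.

```python
from dataclasses import dataclass


@dataclass
class FailedDrive:
    drive_id: int
    error_count: int      # errors noted from the F2 screen (entered by hand)
    size_ok: bool = True  # whether the drive reports its proper size


def force_online_order(drives):
    """Return drive IDs in the order to try forcing them online, one at a time.

    Drives that misreport their size are skipped. Fewer errors goes first;
    ties keep their input order, which is as arbitrary as picking at random.
    """
    candidates = [d for d in drives if d.size_ok]
    return [d.drive_id for d in sorted(candidates, key=lambda d: d.error_count)]


if __name__ == "__main__":
    failed = [FailedDrive(drive_id=0, error_count=12),
              FailedDrive(drive_id=1, error_count=3)]
    print(force_online_order(failed))  # [1, 0]
```

Only one drive from the list is forced online at a time; if Windows boots, the remaining drive is rebuilt rather than forced.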

Dell offers free phone support for the life of any enterprise system, regardless of warranty status. Please make use of it in the future when you have situations such as this. You are more likely to recover with one of their techs on the phone, as they deal with situations such as this multiple times a day.

As for the drives falling offline in the first place, obviously this is not intended behavior. Verify that you have the latest PERC 4e/Di firmware [v.5A2D] and check the firmware on the hard drives themselves [this can be done in the controller BIOS or OpenManage]. A lot of these PE2850s went out with Maxtor SCSI drives that have a JNZY or JNZM firmware, or Seagates with D702/703... these firmware levels do have timing issues when communicating with the controller. I would not be surprised if those drives had a firmware update out against them, but that just depends on what's in there.
0
 

Author Comment

by:taverny
ID: 21869285
Hi guys,
Lee, thank you so much for your support. You did a great job getting me back online in no time. I really appreciate your concern.
Akirhol,
thank you for your response. I didn't know about Dell's support; I will make sure to call them as well next time. You seem to know a lot about Dell arrays.
I will be asking 2 more questions in the forum, which I know you have the answers to (how to create a good backup of Windows 2003, and how to monitor and be alerted when a RAID controller fails or when one of the hard drives fails).
I will link those questions to this post so that you will be alerted when I create them. I will try to write them this evening.

thank you so much again!!!!!!
0
 

Author Closing Comment

by:taverny
ID: 31469017
Thank you guys. you really deserve the title "Experts"
0
