Dell PE T310
PERC S300 (Yes, I know this card sucks)
Came in 2 days ago and the server was down. I checked the drive status, which showed 2 of the 3 drives in the RAID 5 array as down (Drives 0 and 1). I rebooted the server, which then showed that only 1 drive was out. I went into the BIOS, which reported that Drive 0 was "Ready", Drive 1 was "Online", and Drive 2 was "Spare" (which it should never have been). I allowed it to continue booting. I didn't change anything.
Windows boots!! But, after about 5 minutes it crashes and reboots, at which point it performs a disk check and whatnot. After it cycles through this twice, Windows now refuses to boot all the way. Now, when it attempts to boot, I get the regular splash screen with the moving bar for about 3 minutes, then I get a black screen with a mouse pointer on it. The mouse pointer is moveable with the mouse, but after about 1 minute the system crashes and begins a reboot. There are no error messages or BSOD - just a reboot.
I call Dell and go through all of their troubleshooting steps, which didn't go very far before they said we need 2 new disks and a controller card, and that we will need to start from scratch and rebuild from a backup. I go through the motions to get the new hardware sent out, but I continue to attempt to save this thing.
Now, I am able to get into the Recovery Console, so this is a good start. The first thing I did was back up all the data on the server with Robocopy to our external HDD (our backups hadn't run in a couple of weeks). I'm able to get all the data, and I also back up the sysvol, Exchange database and other items that seem important.
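For anyone repeating the Robocopy step, here is a rough sketch of how such a command line could be put together. The paths (D:\Data, F:\Backup) and the log file name are made up for illustration and are not the ones used on this server; drive letters inside the recovery environment often differ from the ones Windows normally assigns. The switches are standard Robocopy options: /E copies subfolders including empty ones, /COPY:DAT copies data, attributes and timestamps, /R:1 /W:1 keeps it from stalling on unreadable files, /XJ skips junction points, and /LOG: writes a log.

```python
from typing import List


def robocopy_command(source: str, dest: str, log_file: str) -> List[str]:
    """Build a Robocopy argument list for a one-off rescue copy.

    The paths passed in are placeholders; adjust them to the drive
    letters that the recovery environment actually shows.
    """
    return [
        "robocopy",
        source,
        dest,
        "/E",         # copy subdirectories, including empty ones
        "/COPY:DAT",  # copy data, attributes and timestamps
        "/R:1",       # retry a failed file once...
        "/W:1",       # ...after waiting one second
        "/XJ",        # skip junction points to avoid loops
        f"/LOG:{log_file}",
    ]


# Hypothetical source and destination, for illustration only.
command = robocopy_command(r"D:\Data", r"F:\Backup\Data", r"F:\Backup\data.log")
print(" ".join(command))
```

The printed line is what would be typed at the recovery prompt. Low retry and wait values matter on a failing array, since the Robocopy defaults retry each bad file many times with a long pause in between.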
Next, I attempted to use various tools to fix the installation. I still haven't changed anything with the drives even though I have the new hardware on site. My goal at this point is to get Windows to boot and run a system image backup that I can restore to a fresh array. I first attempt startrep, then I try chkdsk /f, then I run sfc in offline mode, and finally a chkdsk /r. None of these get SBS to boot all the way, and I'm still stuck with the same boot characteristics.
So, I pull out drive 0 and replace it, since that drive was definitely dead. I bring up the BIOS and it shows the new drive as a NON-RAID disk with its own virtual drive on it. I delete this virtual drive, which changes its status to READY, at which point I add it as a Global Hot Spare. Now the drive status has changed from above to: drive 0 - spare, 1 - online, 2 - spare. Obviously you can't have a RAID 5 set with 2 spares and 1 online drive. I'm guessing the PERC adapter freaked out when 2 drives failed and set the remaining good drive to a spare. I unassigned drive 2 from being a spare, at which point I was pleasantly surprised to see it return to "online" status. Now it looked like I was getting somewhere. The status now is: drive 0 - spare, 1 - online, 2 - online.
Now I'm very unhappy to see that there is no "rebuild" functionality in the BIOS, but from reading online, the array should start rebuilding once Windows has booted and the software RAID can do its thing. This is a downer because I still can't get into Windows (still the same characteristics). HOWEVER, when Windows is booting, I see it accessing all three drives, even the new one with nothing on it, which leads me to believe that the Windows boot is getting far enough for the array to automatically start rebuilding. But then it crashes and reboots.
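To spell out why a 3-drive RAID 5 set can't run with only 1 online member, here is a toy sketch of the XOR parity idea behind RAID 5 in general. It is not the PERC S300's actual on-disk layout (I don't know its stripe size or parity rotation); the block contents and function names are made up for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_stripe(stripe: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Fill in at most one missing block (None) from the surviving ones."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("RAID 5 cannot rebuild a stripe with more than one missing member")
    rebuilt = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        # The XOR of all surviving blocks equals the missing block.
        rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt


# One stripe across three drives: two data blocks plus their parity.
d0 = b"data"
d1 = b"more"
parity = xor_blocks([d0, d1])

# One member gone: the missing block comes back from the other two.
print(rebuild_stripe([None, d1, parity])[0] == d0)

# Two members gone: there is not enough information left.
try:
    rebuild_stripe([None, None, parity])
except ValueError as err:
    print(err)
```

With 2 online members and 1 spare, every stripe still has two of its three blocks, so a rebuild onto the spare is possible in principle. With 1 online member it is not, and any unreadable sectors on a surviving member during the rebuild leave those stripes unrecoverable.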
Discouraged, I next boot into the recovery environment again. I notice that even while in the recovery environment it appears that the array is rebuilding from the light activity on the front of the drives. I decide to call it a night and let it do what it's doing.... This may be good!!
I wake up and see that the rebuild appears to be done, but now drive 1 is blinking orange. I reboot and stop in the RAID BIOS screen to see what's up. It shows that one of my virtual drives (not C: though) is rebuilt to normal, but my C: drive is still degraded. I don't change anything and allow it to continue booting, but again, same characteristics. During boot, drive 1 turns green, and judging from the lights most of the boot is occurring from that drive; midway through the boot that drive turns orange again. The system continues to have the same boot characteristics. I decide to pull disk 1 from the array and reboot. This time it boots partially and then BSODs.
I place disk 1 back in the array and am back to the same situation.
1. Has anyone had this type of situation happen, and were you able to get out of it without rebuilding from scratch?
2. I am able to back up data from the recovery environment. If I can't get out of this, what else should be backed up to aid in a successful rebuild of SBS? So far I have backed up all user data, the Exchange database, the Exchange folder from C:\program files, sysvol, and inetpub.
Thanks for reading this whole thing and I hope it makes sense.