RAID 1 both drives listed as global hot spares

I have a Dell T410 server with a PERC S300 RAID controller running RAID 1.  One drive failed and the other is degraded.  I replaced only the failed drive because I had just one spare available.  In the boot utility (Ctrl-R) I configured the new drive as a hot spare.  When I rebooted and hit Ctrl-R again, the Virtual Disk was listed as degraded but both drives were listed as hot spares.

I can still boot to Windows Server 2008 R2.  In OpenManage Server Administrator the Virtual Disk name says NONE and the status says Failed.  The only task available is Delete.

Under Physical Disks, 0:0:0 has a state of Degraded with failure predicted YES, and 0:0:1 has a state of Online with failure predicted NO.  For both, the only tasks available are Blink, Unblink and Unassign Global Hot Spare.

The system is running, but there is a bad sector in the middle of my SQL database.  I cannot complete a backup using the installed Shadow Protect, nor can I get a successful database backup using SQL Server.

Yikes - I've ordered 2 additional drives.  How can I get Server Administrator to recognize the Virtual Disk so I can initiate a rebuild?  It's unclear whether the 2nd physical disk is actually functioning as a hot spare.  I did try booting with only the second drive and that failed.  I'm concerned that if drive 0 fails I'm SOL.

I'm looking for very specific advice applicable to this situation and the details described.  Please no woulda, coulda, shoulda comments or general advice that doesn't address the issues that are described.
Thanks
Mark Savastano Asked:
 
Mark Savastano, Author, Commented:
An Update to my situation.

Per my description, Drive 1 has failed and Drive 0 is degraded due to bad sectors.  Installing a replacement hot spare for Drive 1 did not allow the array to rebuild because of the degraded state of Drive 0: the rebuild starts and then fails.  Current attempts at backup also fail due to the drive errors, and attempts to copy some files fail for the same reason.  I know this is an unusual problem, especially with two drives in the array failing, and that's why I posted here looking for possible suggestions.

The quickest solution was to fall back to the existing backup and do a bare-metal restore to different hardware, which avoids the problems associated with the PERC S300.  There was only a small amount of data loss.  Case Closed.  I'll be migrating any other customers who have this configuration to new hardware as well.
 
yo_bee, Director of Information Technology, Commented:
If you have an enterprise-level RAID controller card you should not have to play around with any of your RAID configurations.  I am not 100% sure, but in my experience you pull the failed drive out and replace it with the new drive (hot-swappable).  Once the RAID controller sees the new healthy drive, the rebuild starts.  I am not sure whether your change in the boot utility caused the issue you are seeing, but from what I know about RAID I suspect it did.

The question is: how did you find out you had a failed drive as well as a degrading one?  It is not common to see two drives fail or degrade at the same time.
 
Mark Savastano, Author, Commented:
It sounds like you're not familiar with this type of controller.  Yes, they did both fail at the same time for unknown reasons.  I am seeking suggestions from someone who has experience with the DELL software and hardware that I mentioned.  Thanks.
 
Seth Simmons, Sr. Systems Administrator, Commented:
hope you have a good recent backup if you can't do one now
the fact that the other drive shows predicted failure will prevent you from rebuilding
you really don't have any options here besides replacing the hardware, creating a new array and restoring (or rebuilding) the server
the loss of one drive leaves a single point of failure; the other one is near failure so you're in a bad position you can't easily recover from
normally i would recommend raid 6 or 10 for performance and better fault tolerance when running things like sql or exchange but that controller doesn't support either of them
 
Dariusz Tyka, ICT Infrastructure Specialist Senior, Commented:
In your case I would make a server backup immediately. You can use the free version of Veeam Agent for Windows. Also create recovery media, so that once you have recreated the RAID on new drives you can boot the server from the recovery media and restore the whole server from backup.
https://www.veeam.com/windows-cloud-server-backup-agent.html
 
andyalder Commented:
I've seen several similar problems with the fakeRAID S100 and S300 giving completely impossible sizes to virtual disks and incorrect status. Best for the future is to replace it with a proper PERC controller such as the H700.

As it's only RAID1 you could take one disk off the S300 and connect it direct to a SAS/SATA HBA and recover the data as it doesn't need to be de-striped. There's a multitude of programs that could then be run to try to "repair" it or recover files.
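For anyone wanting to see what "pick the data off and skip what can't be read" looks like, here is a minimal Python sketch. It is not what any particular recovery program does internally; the function name `rescue_copy` and its parameters are made up for illustration, and opening a raw disk (as opposed to an ordinary file) is platform-specific and not shown. It copies block by block and zero-fills any block that raises a read error, returning the list of gaps so you know which offsets are suspect.

```python
def rescue_copy(src, dst, total_size, block_size=64 * 1024):
    """Copy total_size bytes from src to dst, zero-filling unreadable blocks.

    src and dst are binary file objects (hypothetical usage: src is the
    failing disk or file opened read-only, dst is an image on a healthy disk).
    Returns a list of (offset, length) ranges that could not be read.
    """
    bad = []
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        try:
            # Seek explicitly each time so a failed read cannot leave the
            # position somewhere unexpected.
            src.seek(offset)
            data = src.read(length)
        except OSError:
            data = b""
        if len(data) < length:
            # Failed or short read: record the gap and pad with zeros so
            # everything after it stays at its original offset.
            missing = length - len(data)
            bad.append((offset + len(data), missing))
            data += b"\x00" * missing
        dst.write(data)
        offset += length
    return bad
```

A smaller `block_size` loses less data around each bad spot but makes more read attempts against a failing drive, so it is a trade-off rather than a setting with one right answer.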

Unfortunate about the bad sector in the SQL database. You can run chkdsk /R on the volume, but that will only "fix" it by replacing the bad sector with zeros as far as the file contents are concerned. You can then run DBCC CHECKDB on the database, but even that can't recreate the missing block. DBCC may tell you which records are affected, though, so at least you could then manually examine those records.
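If you end up with a copy of the data file in which the bad sector has been zeroed, a rough Python sketch like the one below can list which 8 KB pages contain an all-zero 512-byte sector. The 512-byte sector size is an assumption about the drive, the function name `zeroed_pages` is made up, and zero-filled sectors also occur legitimately inside database files (unused space), so treat the output as a list of candidates to cross-check against what DBCC reports, not as a diagnosis.

```python
SECTOR_SIZE = 512   # assumed disk sector size
PAGE_SIZE = 8192    # SQL Server data pages are 8 KB

def zeroed_pages(path, sector_size=SECTOR_SIZE, page_size=PAGE_SIZE):
    """Return sorted page numbers that contain at least one all-zero sector."""
    pages = set()
    zero = bytes(sector_size)
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(sector_size)
            if not chunk:
                break
            # Compare against a zero block of the same length so a short
            # final chunk is handled too.
            if chunk == zero[:len(chunk)]:
                pages.add(offset // page_size)
            offset += len(chunk)
    return sorted(pages)
```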
 
Mark Savastano, Author, Commented:
Thanks for that suggestion, but I have a feeling it won't work since Windows is looking for the S300 controller.  I'll need to do a search on booting the T410 from the SATA controller; I know it's not a simple process.  I saw a suggestion on Spiceworks to unassign the hot spare designation in the controller BIOS, and I was going to give that a go as well.  Waiting for my replacement drives to arrive tomorrow morning.
 
andyalder Commented:
Replacing it with a decent controller was a suggestion for a future rebuild, not a fix for this. You can't migrate directly from S300 to H700 anyway; https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Upgrading-T310-from-S300-to-H700i/m-p/4262967 describes the procedure, but note the start of the second paragraph: they've seen even more absurd behaviour with the S300.

Similarly, connecting the disk to a SAS/SATA HBA to pick files off with a recovery tool would be done on a recovery PC booted from a different source, not on the T410 itself.
 
andyalder Commented:
So you're going to do what I suggested and get rid of the S300 in the future?
 
Mark Savastano, Author, Commented:
No other suggestions contributed to this conclusion or were helpful in any way.
 
andyalder Commented:
Glad I could help.
 
Mark Savastano, Author, Commented:
Sorry, but your comments had no bearing on my decision.  I had already replaced the server and restored from a backup.  My only concern with respect to this post was regaining access to the hardware in order to recover the database.