aevegan
asked on
RAID 1+0 array volume inaccessible with 1 failed disk
I have an odd situation. I have an HP DL380 G5 with an 8-disk RAID 1+0 array on a P400 controller. The machine was remotely shut down for maintenance, and as I walked over to the rack I noticed the HDD failure light on bay 2 was lit just before the power-off completed. When I powered the machine back on, ESX wouldn't find the storage volume (our ESX system files reside in flash storage, so it still boots fine). I restarted the machine again and popped in the ACU disk to check things out, and saw warnings that the logical drive was in automated recovery mode and that the disk in bay 2 was as well. Physically, I noticed the lights on bays 2 and 6 slowly flashing. There was also a warning for imminent failure in bay 4. If I let the recovery run, it comes back with an error that disk 2 is dead. I don't have the replacement in my possession yet, but I would expect the volume to still be accessible with one dead disk. Any ideas what may be causing it to be unavailable? I will put the replacement drive into bay 2 as soon as I receive it, and will wait to replace bay 4 until bay 2 has, hopefully, rebuilt. But I am suspicious as to why I cannot access the volume right now. Thanks.
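The expectation that the volume should survive one dead disk can be sketched in a few lines. This is only an illustration, not the controller's actual logic: a RAID 1+0 volume stays available as long as no mirror pair has lost both of its members. The pairing used here (bay i mirrored with bay i + 4 on an 8-disk array) is an assumption made for the example, and the function names are hypothetical; the real mirror layout should be read from ACU.

```python
from typing import Iterable, List, Tuple


def mirror_pairs(num_disks: int) -> List[Tuple[int, int]]:
    """Return assumed mirror pairs: bay i is paired with bay i + num_disks/2.

    ASSUMPTION: this pairing is illustrative only and is not confirmed
    for the P400; check the actual layout in ACU.
    """
    if num_disks < 2 or num_disks % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of disks (at least 2)")
    half = num_disks // 2
    return [(bay, bay + half) for bay in range(1, half + 1)]


def volume_survives(num_disks: int, failed_bays: Iterable[int]) -> bool:
    """True if every mirror pair still has at least one working disk."""
    failed = set(failed_bays)
    return all(
        not (a in failed and b in failed)
        for a, b in mirror_pairs(num_disks)
    )


if __name__ == "__main__":
    print(volume_survives(8, [2]))     # bay 2 dead only
    print(volume_survives(8, [2, 4]))  # bay 2 dead, bay 4 also lost
    print(volume_survives(8, [2, 6]))  # both members of one assumed pair lost
```

Under that assumed pairing, losing bay 2 alone, or bays 2 and 4 together, leaves every pair with a working member, while losing both disks of one pair takes the volume down. That is why a single dead disk on its own should not make the volume inaccessible.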
Sometimes a dead disk blocks all I/O on the drive controller. Try removing the dead disk and see whether ACU detects your volumes.
Reboot and watch during POST for a message from the controller saying it will enable/disable the logical disk. It is something along the lines of 'Select "F2" to accept data loss and to re-enable logical drive(s)'; the default, F1, disables the logical disks.
ASKER
Everything worked out in the end. Not really sure why it went haywire, but it's been stable for a long time now.
ASKER
Self Resolved
ACU has some diagnostic info. Run it and look at the health and configuration info. If the data isn't available now, and the array has already done a rebuild or is doing one, then unfortunately you have 100% data loss.
Since the array is rebuilding, accessibility isn't going to change when the run completes.
Another possibility is that the drive ordering was changed, so that what was once the D:\ drive is now the F:\ drive. Post the event logs from ACU.