HP MSA 30 Logical Drive Failure

wpiitm
I accidentally pulled the plug on this array device this morning when I was reconnecting some other devices to a UPS. I turned it back on and received error messages that the logical drives have failed. I ran the array diagnostic utility and this is what I got.
report.txt

Top Expert 2014

Commented:
I think you just have to power-cycle the server it's plugged into, or have you done that?

Although it shows all the logical disks as failed (which you would expect if they suddenly disappeared from the bus), it doesn't show any serious problems with the hard disks. It even lists the RIS metadata on the MSA30 drives as being correct.

Author

Commented:
I have power cycled the server a couple of times with no luck. Do I need to leave it off for an extended period of time? I even reseated all of the drives and cables before powering back on. It has our Exchange and accounting software data. We do not have recent backups, so I am a little worried!

Commented:
I saw something similar on my MSA20 a few years ago.
Try powering off the MSA and shutting the server down. Then power on the MSA and, once it is completely up (no flashing LEDs), power on the server. Watch the bootup to see whether it discovers the volume, and note any errors.
Top Expert 2014

Commented:
Hmm, all the disks in the MSA have this:
Last Failure Reason: 0x20 (RIS data could not be saved to the drive)
Don't see how that would matter though.

No disks in bays 6 and 7, is that correct?

Author

Commented:
On bootup it says something about drive failure and strike F1 to continue with drives disabled or F2 to re-enable drives.

There are disks in all of the bays.
Top Expert 2014

Commented:
Report shows drives 6 and 7 not there, so that means a double disk fault. I would power off, reseat those disks and power on again. Hopefully at least one of them spins up for long enough to back it up.

Do not re-enable the logical drives until at least one of those two drives is seen in the ADU report.

Here's the area of the report where I can see those two disks missing. To find it, search the report for "RIS Copy 0: All RIS bytes zero (drive spare or unused)" and look a few lines above that.


   SCSI Port 1, Drive ID 1:
      RIS drive: 0x1
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 2:  
      RIS drive: 0x2
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 3:  
      RIS drive: 0x3
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 4:  
      RIS drive: 0x4
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 5:  
      RIS drive: 0x5
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 6:  Physical drive not connected.
   SCSI Port 1, Drive ID 7:  Physical drive not connected.
   SCSI Port 1, Drive ID 8:  
      RIS drive: 0x8
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 9:  
      RIS drive: 0x9
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 10:  
      RIS drive: 0xa
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 11:  
      RIS drive: 0xb
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 12:  
      RIS drive: 0xc
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 13:  
      RIS drive: 0xd
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 14:  
      RIS drive: 0xe
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 15:  
      RIS drive: 0x0
      RIS Copy 0: All RIS bytes zero (drive spare or unused).
      RIS Copy 1: All RIS bytes zero (drive spare or unused).
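
If you'd rather not eyeball the whole report, something like the following could pull out the same information. This is only a rough sketch: the function name `classify_drives` and the `SAMPLE` text are made up for illustration, and the line patterns are taken from the excerpt above, so other sections or other ADU versions may be formatted differently.

```python
import re

# Pattern based on the "SCSI Port N, Drive ID M:" lines in the excerpt above.
# Assumption: other parts of a real ADU report may not follow this shape.
DRIVE_RE = re.compile(r"^\s*SCSI Port (\d+), Drive ID (\d+):\s*(.*)$")


def classify_drives(report_text):
    """Return {(port, drive_id): status} for each drive entry found."""
    status = {}
    current = None
    for line in report_text.splitlines():
        match = DRIVE_RE.match(line)
        if match:
            current = (int(match.group(1)), int(match.group(2)))
            tail = match.group(3)
            if "Physical drive not connected" in tail:
                status[current] = "not connected"
            else:
                status[current] = "present"
        elif current is not None and "All RIS bytes zero" in line:
            # Blank RIS copies mean the slot holds a spare or unused drive.
            status[current] = "spare or unused"
    return status


# Hypothetical sample, shortened from the excerpt above.
SAMPLE = """\
   SCSI Port 1, Drive ID 5:
      RIS drive: 0x5
      RIS Copy 0:  Same as above.
      RIS Copy 1:  Same as above.
   SCSI Port 1, Drive ID 6:  Physical drive not connected.
   SCSI Port 1, Drive ID 7:  Physical drive not connected.
   SCSI Port 1, Drive ID 15:
      RIS drive: 0x0
      RIS Copy 0: All RIS bytes zero (drive spare or unused).
      RIS Copy 1: All RIS bytes zero (drive spare or unused).
"""

if __name__ == "__main__":
    for (port, drive_id), state in sorted(classify_drives(SAMPLE).items()):
        print(f"Port {port}, ID {drive_id}: {state}")
```

Anything reported as "not connected" is only a problem if there really is a disk in that bay, so check the result against what is physically installed.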

Author

Commented:
You are correct, no drives in 6 & 7.

Would re-enabling cause me to lose all data on those disks, like it warns?
Top Expert 2014
Commented:
I'd give pressing F2 to re-enable the logical drives a 97% chance of working, since you don't have any missing disks after all. The 3% doubt is that there may be junk in the controller cache that gets flushed to the wrong disk, or important data in it that doesn't get flushed.

Taking each disk out of the cage, making an image of it, and then de-striping the images would give a 99.5% chance of recovery, but that would take a couple of days of hard work plus software costs.

Ultimately it's your call. If the data's worth less than about $5K I'd say press F2; if it's worth more I'd call in a forensic engineer.

Whatever you press, it won't lose all the data on those disks; the warning is over-the-top panic stuff. At worst it will write a little bit of junk on top of the metadata and confuse the controller, making expensive software-based recovery necessary. To lose all the data it would have to spend several hours writing a secure erase pattern on the disks, and it doesn't even have that ability.

Let's hope the cache battery holds up. Don't leave the server powered off while you think about the options; leave it plugged in to avoid the cache going stale, or you might get the odd unexpected corruption from unflushed data.

Author

Commented:
I talked with HP support and they confirmed that re-enabling the disks would not cause a loss of data. They advised that I update the firmware first and see if that fixed it. It did not, so I re-enabled the drives and rebooted, and everything came back up. I am getting a warning that two of the disks are close to failing. I made a backup immediately, just in case. Thanks for the help!
Top Expert 2014

Commented:
Yes, drives 0 and 2 have prefailure alerts. Sometimes there's new firmware rather than a disk replacement, if they realise the performance parameters weren't set quite right, but yours look pretty much up to date (I haven't checked all of them).
