Solved

MSA 1000 - 2 Cabinets ... Disk Failure ...

Posted on 2006-07-10
Last Modified: 2013-11-15
I came in after the weekend and found the following event log entry on a few of my servers.

Physical Drive in Box 1, Bay 12 of the Array Controller \Device\FibreArray1, HBA Slot 4, Chassis: 9J3xxxxxxxxx, has failed. Failure Code: 0x30.

Immediately following this entry are three more ...

Logical Drive 1 of Array Controller \Device\FibreArray1, HBA Slot 4, Chassis: 9J3xxxxxxxxx, has changed status from OK to INTERIM RECOVERY.

Logical Drive 2 of Array Controller \Device\FibreArray1, HBA Slot 4, Chassis: 9J3xxxxxxxxx, has changed status from OK to INTERIM RECOVERY.

Logical Drive 3 of Array Controller \Device\FibreArray1, HBA Slot 4, Chassis: 9J3xxxxxxxxx, has changed status from OK to INTERIM RECOVERY.

I'm assuming this is telling me that one of the drives in the array went bad. However, when I look at the cabinet, I don't see amber lights on drive 12. The controller does show INTERIM RECOVERY on the LCD panel. How should I proceed to make sure that my disks are in good condition?
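
For anyone checking the same entries across several servers, here is a minimal sketch (not an HP tool) that pulls the drive location and status changes out of event text like the entries quoted above. It assumes the messages have already been exported to plain text, one per line; the regular expressions are built only from the wording shown in this question, and the name `summarize_events` is hypothetical. Note that the Box and Bay fields are what locate the physical drive in the enclosure.

```python
import re

# Patterns derived only from the event text quoted in this question.
# Assumption: other driver or firmware versions may word these differently.
PHYSICAL_RE = re.compile(
    r"Physical Drive in Box (?P<box>\d+), Bay (?P<bay>\d+) of the Array Controller "
    r"(?P<controller>\S+), HBA Slot (?P<hba_slot>\d+), Chassis: (?P<chassis>\w+), "
    r"has failed\. Failure Code: (?P<code>0x[0-9A-Fa-f]+)\."
)
LOGICAL_RE = re.compile(
    r"Logical Drive (?P<drive>\d+) of Array Controller (?P<controller>\S+), "
    r"HBA Slot (?P<hba_slot>\d+), Chassis: (?P<chassis>\w+), "
    r"has changed status from (?P<old>[A-Z ]+?) to (?P<new>[A-Z ]+)\."
)


def summarize_events(lines):
    """Return (failed_drives, changed_logical_drives) found in the given lines."""
    failed = []   # (box, bay, failure_code)
    changed = []  # (logical_drive, old_status, new_status)
    for line in lines:
        match = PHYSICAL_RE.search(line)
        if match:
            failed.append((int(match["box"]), int(match["bay"]), match["code"]))
            continue
        match = LOGICAL_RE.search(line)
        if match and match["new"] != "OK":
            changed.append((int(match["drive"]), match["old"], match["new"]))
    return failed, changed


# Sample input copied from the entries above (raw strings keep the backslashes).
SAMPLE = [
    r"Physical Drive in Box 1, Bay 12 of the Array Controller \Device\FibreArray1, "
    r"HBA Slot 4, Chassis: 9J3xxxxxxxxx, has failed. Failure Code: 0x30.",
    r"Logical Drive 1 of Array Controller \Device\FibreArray1, HBA Slot 4, "
    r"Chassis: 9J3xxxxxxxxx, has changed status from OK to INTERIM RECOVERY.",
]

if __name__ == "__main__":
    failed, changed = summarize_events(SAMPLE)
    for box, bay, code in failed:
        print(f"Failed physical drive: box {box}, bay {bay}, failure code {code}")
    for drive, old, new in changed:
        print(f"Logical drive {drive}: {old} -> {new}")
```

Running it over the exported log of each server makes it easy to confirm that every host is reporting the same box and bay before pulling a drive.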
Question by:Chadwhite
13 Comments
 
LVL 88

Accepted Solution

by:
rindi earned 84 total points
ID: 17072835
It looks as if this was a temporary disk problem, but the "Interim Recovery" has fixed the problem again. If the problem shows up again on the drive in HBA slot 4 it may help to replace that HD. If on the other hand you get the same problem on another slot it may help to replace the cables or the array controller, or update the firmware of the controller.
 
LVL 55

Assisted Solution

by:andyalder
andyalder earned 83 total points
ID: 17073377
You'd think "Interim Recovery" would mean the array is rebuilding, but it doesn't; it means the array is running with a disk down. Run the Array Diagnostic Utility (ADU) on a server connected to the storage and see what it reports, then run the Array Configuration Utility (ACU) as well. Between them, the two programs will give you more information.
 
LVL 3

Author Comment

by:Chadwhite
ID: 17074144
I ran ADU, and I think I see the drive in slot 12 showing no errors logged, but the output is a bit confusing (hex) and very long. I ran ACU, where it's easier to isolate drive 12 in box 1, but it shows up as OK. Am I missing something? Does anyone have any pointers for interpreting the ADU information?

Thanks!

 
LVL 55

Expert Comment

by:andyalder
ID: 17075358
Is Interim Recovery the last thing on the MSA log? If someone uses the up button and scrolls past an error, I'm not sure whether that gets into the event log or not, so make sure the MSA controller is showing the last message.
 
LVL 15

Assisted Solution

by:mcp_jon
mcp_jon earned 83 total points
ID: 17091255
Try updating the HP ACU utility, and check again!

Maybe something got corrupted and is causing that odd reading. (I've seen it happen.)

Best Regards !
 
LVL 20

Expert Comment

by:brwwiggins
ID: 17091269
Check the firmware on your MSA. I had a similar problem where my MSA reported a drive as bad on a few servers (but not all). HP recommended that I update the firmware (which is usually their canned solution), and it worked for me in this case.
 
LVL 15

Expert Comment

by:mcp_jon
ID: 17091276
 
LVL 15

Expert Comment

by:mcp_jon
ID: 17091292
For Windows, it's the ACU GUI (graphical user interface), version 7.50.23.0, dated 13 Apr 06.

Best Regards !
 
LVL 15

Expert Comment

by:mcp_jon
ID: 17599283
I'd suggest a Split !

Best Regards !
 
LVL 15

Expert Comment

by:mcp_jon
ID: 17649814
Fine by me !

Best Regards !
