carlino70 asked:

Failure in controller of HP StorageWorks MSA2012fc

Hi experts, I have the following critical error in the MSA logs (attached):
TUE OCT 07 11:40:55 2014 [314] #A16253: MSA2012fc Array SN#00C0FFD839E1 Controller A CRITICAL: FRU type: RAID IOM B, problem: encl 0. Product ID: AJ744A, S/N: 3CL922S426 rev: H.  Related event ID: 10016252, type: 313

The array is connected to an Oracle RAC server with two instances.
The alert log of Oracle instance #2 says:
Errors in file /cots/oracle/app/oracle/admin/xa21/bdump/xa212_j001_30215.trc:
ORA-27091: unable to queue I/O
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 1529712
Additional information: -1
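
As a small aid for reading the trace above: the number in "Linux-x86_64 Error: 5" is an operating-system errno value. The sketch below only looks up that number with Python's standard library. It assumes it is run on a Linux host like the one in the question, and it says nothing about which device caused the error.

```python
import errno
import os

# Error number copied from the Oracle trace line "Linux-x86_64 Error: 5".
code = 5

# errno.errorcode maps the number to its symbolic name;
# os.strerror returns the platform's descriptive text for it.
symbol = errno.errorcode[code]
print(symbol, "-", os.strerror(code))
```

The symbolic name is EIO, a generic low-level I/O failure reported by the kernel, which fits a problem on the storage path rather than inside Oracle itself.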

Oracle instance #1 does not report errors.
The OS version is:
uname -a
Linux 2.6.18-53.1.21.el5 #1 SMP Wed May 7 08:42:34 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

The component versions on the MSA are:
# versions
Controller A Versions
---------------------
Storage Controller CPU Type   : Celeron 566MHz
Storage Controller Firmware   : J200P30
Storage Controller Memory     : F300R22
Storage Controller Loader     : 15.010
Management Controller Firmware: W420R52
Management Controller Loader  : 12.013
Expander Controller Firmware  : 3022
CPLD Revision                 : 27
Hardware Revision             : LCA 56
Host Interface Module         : 26
Host Interface Module Model   : 1

Controller B Versions
---------------------
Storage Controller CPU Type   : Celeron 566MHz
Storage Controller Firmware   : J200P30
Storage Controller Memory     : F300R22
Storage Controller Loader     : 15.010
Management Controller Firmware: W420R52
Management Controller Loader  : 12.013
Expander Controller Firmware  : 3022
CPLD Revision                 : 27
Hardware Revision             : LCA 56
Host Interface Module         : 26
Host Interface Module Model   : 1

Could you please suggest the steps to find the problem and resolve it?
Thank you.
Regards
logs-from-20140804.log
SOLUTION
Paul Solovyovsky

This solution is only available to members of Experts Exchange.
Stampel

When you look at your MSA, do you have one or two controllers?
Do you see any orange light or LED message on it?
carlino70 (ASKER)

paulsolov, the warranty has expired. I searched the HP forums for similar cases and found various attempted solutions, with varying results.
Stampel, the MSA has 2 controllers.
There are no orange lights or messages on the LEDs.
Currently the 'LINK' LEDs of each fibre-optic input on controller #1 are off. But I constantly see link up/down events from both A and B:
TUE OCT 14 14:28:09 2014 [111] #A16361: MSA2012fc Array SN#00C0FFD839E1 Controller A INFORMATIONAL: Host link up Chan1: 4 Loop IDs, External Device(s)
TUE OCT 14 14:28:08 2014 [112] #A16360: MSA2012fc Array SN#00C0FFD839E1 Controller A WARNING: Host link down Chan1
TUE OCT 14 14:28:08 2014 [111] #A16359: MSA2012fc Array SN#00C0FFD839E1 Controller A INFORMATIONAL: Host link up Chan0: 4 Loop IDs, External Device(s)
TUE OCT 14 14:28:08 2014 [112] #A16358: MSA2012fc Array SN#00C0FFD839E1 Controller A WARNING: Host link down Chan0
TUE OCT 14 14:28:08 2014 [111] #B10072: MSA2012fc Array SN#00C0FFD839E1 Controller B INFORMATIONAL: Host link up Chan0: 4 Loop IDs, External Device(s)
TUE OCT 14 14:28:08 2014 [112] #B10071: MSA2012fc Array SN#00C0FFD839E1 Controller B WARNING: Host link down Chan0
TUE OCT 14 14:28:08 2014 [111] #B10070: MSA2012fc Array SN#00C0FFD839E1 Controller B INFORMATIONAL: Host link up Chan1: 4 Loop IDs, External Device(s)
TUE OCT 14 14:28:08 2014 [112] #B10069: MSA2012fc Array SN#00C0FFD839E1 Controller B WARNING: Host link down Chan1
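
To judge how often each host port flaps, the events above can be tallied per controller and channel. The following is a minimal sketch, not an HP tool: the regular expressions and the function name `count_link_events` are assumptions based only on the log lines quoted in this thread, and other firmware revisions may format events differently.

```python
import re
from collections import Counter

# Assumed event layout, inferred only from the lines quoted in this thread.
EVENT_RE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[(?P<code>\d+)\] #(?P<event_id>[AB]\d+): "
    r".*? Controller (?P<controller>[AB]) "
    r"(?P<severity>[A-Z]+): (?P<message>.*)$"
)
LINK_RE = re.compile(r"Host link (?P<state>up|down) (?P<chan>Chan\d+)")

def count_link_events(lines):
    """Count host link up/down events per (controller, channel, state)."""
    counts = Counter()
    for line in lines:
        event = EVENT_RE.match(line.strip())
        if not event:
            continue  # skip lines that are not event records
        link = LINK_RE.search(event["message"])
        if link:
            counts[(event["controller"], link["chan"], link["state"])] += 1
    return counts

prefix = "MSA2012fc Array SN#00C0FFD839E1 Controller"
sample = [
    f"TUE OCT 14 14:28:09 2014 [111] #A16361: {prefix} A INFORMATIONAL: Host link up Chan1: 4 Loop IDs, External Device(s)",
    f"TUE OCT 14 14:28:08 2014 [112] #A16360: {prefix} A WARNING: Host link down Chan1",
    f"TUE OCT 14 14:28:08 2014 [111] #A16359: {prefix} A INFORMATIONAL: Host link up Chan0: 4 Loop IDs, External Device(s)",
    f"TUE OCT 14 14:28:08 2014 [112] #A16358: {prefix} A WARNING: Host link down Chan0",
    f"TUE OCT 14 14:28:08 2014 [111] #B10072: {prefix} B INFORMATIONAL: Host link up Chan0: 4 Loop IDs, External Device(s)",
    f"TUE OCT 14 14:28:08 2014 [112] #B10071: {prefix} B WARNING: Host link down Chan0",
    f"TUE OCT 14 14:28:08 2014 [111] #B10070: {prefix} B INFORMATIONAL: Host link up Chan1: 4 Loop IDs, External Device(s)",
    f"TUE OCT 14 14:28:08 2014 [112] #B10069: {prefix} B WARNING: Host link down Chan1",
]

link_counts = count_link_events(sample)
for key in sorted(link_counts):
    print(key, link_counts[key])
```

Run over a longer stretch of the log, a high "down" count on one channel compared with the others would point at that port's cable, SFP, or HBA, while similar counts on every channel of both controllers suggest a common cause.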

ASKER CERTIFIED SOLUTION
This solution is only available to members of Experts Exchange.

Not yet. I wanted to get an accurate diagnosis first.
The really important message I can see in your logs is:
"CRITICAL: RAID controller B failed, reason PCIE link recovery failed" + Failover ...
The MSA2012fc and MSA2000 have had many problems; I would upgrade the firmware first, just in case.
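
Finding messages like the one quoted above is easier if the event log is filtered by severity. This is a rough sketch under stated assumptions: the regular expression and the helper `events_with_severity` are hypothetical and derived only from the event lines posted in this thread, and the sample reuses two of those lines verbatim.

```python
import re

# Assumed event layout, inferred only from the lines quoted in this thread.
EVENT_RE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[(?P<code>\d+)\] #(?P<event_id>[AB]\d+): "
    r".*? Controller (?P<controller>[AB]) "
    r"(?P<severity>[A-Z]+): (?P<message>.*)$"
)

def events_with_severity(lines, severity="CRITICAL"):
    """Return parsed events (as dicts) whose severity field matches."""
    matches = (EVENT_RE.match(line.strip()) for line in lines)
    return [m.groupdict() for m in matches if m and m["severity"] == severity]

sample_log = [
    "TUE OCT 07 11:40:55 2014 [314] #A16253: MSA2012fc Array SN#00C0FFD839E1 "
    "Controller A CRITICAL: FRU type: RAID IOM B, problem: encl 0. "
    "Product ID: AJ744A, S/N: 3CL922S426 rev: H.  Related event ID: 10016252, type: 313",
    "TUE OCT 14 14:28:08 2014 [112] #A16360: MSA2012fc Array SN#00C0FFD839E1 "
    "Controller A WARNING: Host link down Chan1",
]

critical = events_with_severity(sample_log)
for event in critical:
    print(event["timestamp"], event["event_id"], event["message"])
```

To scan a saved log such as the attached logs-from-20140804.log, an open file object can be passed in place of `sample_log`, provided the file uses the same one-event-per-line layout.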

Do you still have the File I/O error on Oracle RAC?
Stampel, the I/O errors no longer appear.
But the array is now working as if it were a single-controller configuration.
I will do a firmware upgrade and, eventually, replace the controller.
Thanks for your comments.
Regards.