Odd RAID behaviour on HP DL380 G6 with multiple controllers

This is really starting to do my head in.
My DL380 G6 has 3x RAID controllers.
  • 1x P410i with 256MB cache (no battery).
  • 1x P212 with 256MB cache (no battery).
  • 1x P800 with 512MB BBWC.

The P410i came with firmware version 5.70
P212 with v 3.xx
P800 is still v7.xx

The P410i links to the main cage in the server, the P212 to a generic SAS/SATA drive unit and the P800 to an MSA60.

The main reason for the firmware updates on the P410i and P212 was the P212 couldn't see 3TB+ disks.
Since the upgrade, the P410i has had 0x14 lockups in ESXi (1719-Slot 0 Drive Array - A controller failure event occurred prior to this power-up. (Previous lock up code = 0x14)). However, it seems fine in SmartStart and in a Linux live boot (which I used to update the firmware).
ESXi starts up and loads fine when the P410i has its drives out, obviously without loading the VMs stored on it (1x 4x146GB SAS as RAID 10, 1x 4x500GB SATA as RAID 10).

In desperation I took the server home to at least try to recover the data, and on my workbench (with the P212 and P800 disconnected) it booted fine. I began a very cumbersome overnight backup with SCP, but at 500KBps it didn't get very far.
As it ran well, I took it back and plugged it in at the datacentre, and I'm back to square one.
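
For a sense of scale, here is a rough back-of-the-envelope sketch of why an overnight SCP run at that rate gets nowhere. It assumes the worst case that both RAID 10 arrays are full, that "500KBps" means kilobytes per second, and decimal units throughout; the function names are made up for illustration.

```python
def raid10_usable_gb(disks: int, disk_gb: float) -> float:
    """RAID 10 mirrors every disk, so usable space is half the raw total."""
    return disks * disk_gb / 2


def transfer_hours(size_gb: float, rate_kb_per_s: float) -> float:
    """Hours needed to copy size_gb at rate_kb_per_s (1 GB = 1,000,000 KB)."""
    return size_gb * 1_000_000 / rate_kb_per_s / 3600


# Array layouts as described in the post; fill level is an assumption (100%).
arrays = {
    "4x146GB SAS": raid10_usable_gb(4, 146),
    "4x500GB SATA": raid10_usable_gb(4, 500),
}

for name, usable_gb in arrays.items():
    hours = transfer_hours(usable_gb, 500)
    print(f"{name}: {usable_gb:.0f} GB usable, about {hours:.0f} h at 500 KB/s")
```

Even the smaller array would take on the order of days at that rate if it were full, so a single night only covers a small fraction of it.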

Does anyone have any ideas about what I can do or try? When ESXi tries to access the datastores on the P410i, it hangs the SSH session, and I have to physically reset the server to get back in, as ESXi won't shut down or reboot.

I've even disabled the Array Accelerator options to see if that helps. Datastores on the P410i are VMFS6 and those on the P800 are VMFS5 (due to GPT vs MBR compatibility).

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
What version of ESXi is this, and is it compatible with your server, e.g. on the HCL?

I also recall multiple controllers in ESXi giving issues.

Also, your HP server, the DL380 G6, is not certified to work with ESXi 6.x; the last version it was certified for was 5.5 U3.


Have you tried using 5.5 U3?

Do you have VMs you need to recover from the VMFS5 and VMFS6 datastores ?
kiwistag (Author) commented:
ESXi 6.5 U1
This time I have removed the cache RAM from the controller and I have access. I can now back up to a NAS unit at 20mbps, which is something, thankfully.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Your HP DL380 G6  Server is NOT SUPPORTED

So your mileage may vary in terms of what works!

But there are many issues with the driver and Smart Array controllers if you look at the HPE advisories, and that's on supported servers.

kiwistag (Author) commented:
Looks like it could be the cache module. After removing it (working fine), adding it back (failure) and removing it again (fine), the module looks like the likely culprit.

I do have other later-gen servers coming but it's just a matter of time for them to arrive.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
I wonder if your cache module is faulty under a different OS?
kiwistag (Author) commented:
Hrm. Seems the P212 is doing it also. I've deleted all arrays on the P212 (as it's not needed) and am testing further.
Just annoying that with firmware 5.70 the P410i worked perfectly!
kiwistag (Author) commented:
Actually, I think it might be quite obvious, but I never considered it originally and the feedback has helped. I suspect that one of the cache modules is faulty and that swapping them round has caused this mess. I'll close the question now, but thanks for your help.