We have a super weird thing happening. We have two Dell R730 servers, both with Perc H810 controllers in them. The two servers are connected to an IBM V3700 and a Dell MD1220.
Last week, we rebooted the first host and, at boot, we got this error:
"LSI-EFI SAS Driver:
Unhealthy status reported by this UEFI driver without specific error
UEFI0116: One or more boot drivers have reported issues
Check the Driver Health Menu in the Boot Manager for details.
One or more boot drivers require configuration changes. Press any key to load the driver health manager for configurations."
So, we got that after the reboot and after some googling, we decided to move the Perc H810 out of the second server into the first and it booted right up, no issues. So, we chalked it up to a bad card, ordered a replacement, put it into the first server and it booted just fine as well. There you have it, we had a bad Perc card.
Now, fast forward a week. Server 1 has its new Perc and it's running great, seeing all of the storage, and we are happy.
Server 2 is running great with its Perc card that we stole from server 1 last week and life is good.
Now, tonight, I needed to reboot server 2 and boom, it hung and is now reporting the exact same error as noted above.
Pressing any key does nothing. The machine will not boot into anything... iDRAC doesn't even work when it's in this state. All we can do is remove the Perc card, and then it boots normally.
It "could be" that we lost two Perc H810 cards, but it's too coincidental to me. It seems something else is amiss and I can't put my finger on it.
Anyone else know what's going on here?
By the way, we migrated all of the active VMs to the first server with its new Perc H810, and we plan on replacing the H810 in server 2 that now seems to have gone kaput!