VMware ESXi 5.5 cluster hosts keep crashing
Posted on 2014-10-28
We have a two-host ESXi 5.5 cluster made up of 2 IBM x3530 M4s (2x Xeon E5-2450 and 96 GB of memory each) connected to an IBM v3700 DAS. Both hosts are running ESXi 5.5.0 build 2143827. They are in a failover cluster, with one server running all 13 VMs and the other just waiting. They run fine for about a month, then start failing with the errors "Memory, Group 4 CPUs: Bus Uncorrectable error, Group 1 One of the DIMMS 0: Uncorrectable ECC". IBM has been out to replace the system board, both CPUs, all the memory DIMMs and the backplane. Both servers also run ESXi from an IBM commercial-grade USB drive connected to the hypervisor port on the system board, which has also been replaced. Since the same error messages occur on both servers, I am starting to think it is a VMware issue. Has anyone run into this kind of issue, and if so, what was your fix?