Solved

VMware ESXi 5.5 cluster hosts keep crashing

Posted on 2014-10-28
Last Modified: 2014-11-10
We have a two-host ESXi 5.5 cluster with 2 IBM x3530 M4s (2x Xeon E5-2450 and 96 GB of memory) connected to an IBM V3700 DAS. Both hosts are running ESXi 5.5.0 build 2143827. They are in a failover cluster, with one server running all 13 VMs and the other on standby. They run fine for about a month, then start failing with the errors "Memory, Group 4 CPUs: Bus Uncorrectable error, Group 1 One of the DIMMS 0: Uncorrectable ECC". IBM has been out to replace the system board, both CPUs, all the memory DIMMs and the backplane. Both servers also run ESXi from an IBM commercial-grade USB drive connected to the hypervisor port on the system board, and that drive has been replaced too. Since the same error messages occur on both servers, I am starting to think it is a VMware issue. Has anyone run into this kind of issue, and if so, what was your fix?
Question by:jvillareal78
9 Comments
 
LVL 121
ID: 40409051
Are your servers on the HCL?

I assume all the servers are on the latest firmware?

This looks like a hardware issue, and we've seen this before with Dell, IBM and HP.
 

Author Comment

by:jvillareal78
ID: 40409088
This setup was built by the IBM people at CDW. This is what they recommended for our VMware environment. When IBM came out multiple times, they did upgrade the firmware on both machines.
 
LVL 121
ID: 40409170
I would bounce it back to IBM, marked STILL FAULTY!

 

Author Comment

by:jvillareal78
ID: 40409214
I just cannot see the exact same issue on both servers being a hardware issue. The only thing they have in common is VMware ESXi 5.5.
 
LVL 121
ID: 40409248
Do the hosts crash with a PSOD?

Has the memory been tested using memtest86+?

Both are on the HCL for 5.5 U2.

I would escalate to VMware and IBM! VMware are likely to throw it back to IBM!

Have you tried 5.1?
 
LVL 55

Expert Comment

by:andyalder
ID: 40410921
Reboot and press F2 for diagnostics, then look at the BMC log. You could also slow the RAM down in the BIOS, which is a bit of a bodge but may work if it's a timing issue.
 

Author Comment

by:jvillareal78
ID: 40417338
Went into the BIOS and did not see any timing settings that I could change. I did change to non-NUMA, and so far it hasn't gone down.
 
LVL 121

Accepted Solution

by:Andrew Hancock (VMware vExpert / EE MVE^2) earned 500 total points
ID: 40417358
Are you using the correct memory for NUMA and two processors, and are your CPUs balanced correctly with the correct memory?
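
As a rough illustration of what "balanced" means here, the sketch below checks a DIMM layout for the usual NUMA pitfalls: unequal memory per socket, unequal DIMM counts, or mixed DIMM sizes on one socket. The function name, the input format, and the example layout (six 8 GB DIMMs per CPU for 96 GB total) are assumptions made for illustration only; the thread does not state the actual DIMM population, and IBM's population rules for the x3530 M4 should be taken from its documentation.

```python
from typing import Dict, List


def check_numa_balance(dimms_gb_by_socket: Dict[int, List[int]]) -> List[str]:
    """Return human-readable warnings; an empty list means the layout looks balanced.

    dimms_gb_by_socket maps a CPU socket number to the sizes (in GB) of the
    DIMMs installed in that socket's memory slots. Hypothetical input format.
    """
    warnings: List[str] = []

    totals = {sock: sum(dimms) for sock, dimms in dimms_gb_by_socket.items()}
    counts = {sock: len(dimms) for sock, dimms in dimms_gb_by_socket.items()}

    # Each NUMA node should own the same amount of local memory.
    if len(set(totals.values())) > 1:
        warnings.append(f"unequal memory per socket (GB): {totals}")

    # Each socket should have the same number of DIMMs populated.
    if len(set(counts.values())) > 1:
        warnings.append(f"unequal DIMM count per socket: {counts}")

    for sock, dimms in sorted(dimms_gb_by_socket.items()):
        if not dimms:
            warnings.append(f"socket {sock} has no local memory")
        elif len(set(dimms)) > 1:
            warnings.append(f"socket {sock} mixes DIMM sizes: {sorted(set(dimms))}")

    return warnings


if __name__ == "__main__":
    # Assumed example layout: 2 sockets x 6 DIMMs x 8 GB = 96 GB.
    balanced = {0: [8] * 6, 1: [8] * 6}
    lopsided = {0: [16] * 4 + [8] * 2, 1: [8] * 2}
    print(check_numa_balance(balanced))
    for line in check_numa_balance(lopsided):
        print(line)
```

This only checks a layout you type in by hand; it does not read anything from the host or the BMC.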
 

Author Closing Comment

by:jvillareal78
ID: 40433255
With NUMA turned off, I have not had an issue with either server. The issue seems to have been a BIOS setting.

Question has a verified solution.
