What is causing my VMware ESXi 4.1.0 crashes?

Points of My Scenario:
1. I am the admin of VMware ESXi 4.1.0 servers.
2. I have 2 servers of identical hardware model and specs: Model: HP ProLiant DL580 G7, CPU = Intel Xeon E7540, RAM = 48 GB.
3. I have virtual machines configured identically on both ESXi hosts (one set of VMs is a backup for the other).
4. Only one ESXi host regularly crashes with a BSOD (blue screen of death), which is actually purple/magenta in color.
5. I have attached the text of the crash, as it appears on screen, in a text file.
QUESTION: How do I resolve whatever is causing the crash? See the attached text file.
ESXi-4-1-0-Error.txt
waltforbes, Senior IT Specialist, asked:
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
We call the purple screen the PSOD (Purple Screen of Death).

Most of the ESXi 4.1 crashes I've seen in my working history were caused by unsupported drivers or memory errors.

Have you tested your servers for memory errors using HP Diagnostics or Memtest86+?
http://www.memtest.org/

Is the fault random/intermittent?

Can you reproduce the fault?
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Have you also checked that all your firmware is up to date: HP server BIOS, Smart Array controller, iLO, network cards?

Could you give me some more information: VMs, networking, shared storage (Fibre Channel SAN, iSCSI SAN), etc.?

Which server crashes, production or backup? Is anything different between them?
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
The Exception 14 Page Fault may be caused by either a hardware or a software issue.
Source
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020181
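If you have the PSOD text saved to a file, a quick way to see which exception was raised is to scan it for the exception number. Below is a minimal Python sketch of that idea. The exception names are the standard x86 vector numbers; the function name `find_exceptions`, the regular expression, and the sample line are my own assumptions for illustration and are not taken from the attached file or from VMware documentation, so the exact wording on your screen may differ.

```python
import re

# Standard x86 exception vector numbers (architectural, not ESXi-specific).
X86_EXCEPTIONS = {
    6: "Invalid Opcode (#UD)",
    8: "Double Fault (#DF)",
    13: "General Protection Fault (#GP)",
    14: "Page Fault (#PF)",
    18: "Machine Check (#MC)",
}

# Accepts "Exception 14" and "Exception(14)". The exact PSOD wording
# is an assumption; adjust the pattern to match your own screen text.
EXCEPTION_RE = re.compile(r"Exception\s*\(?\s*(\d+)\s*\)?", re.IGNORECASE)


def find_exceptions(psod_text):
    """Return (number, name) pairs for every exception number in the text."""
    found = []
    for match in EXCEPTION_RE.finditer(psod_text):
        number = int(match.group(1))
        found.append((number, X86_EXCEPTIONS.get(number, "unknown")))
    return found


# Hypothetical sample line, made up for this example.
sample = "#PF Exception 14 in world 4130:helper13-0 @ 0x41800c27a8e9"
print(find_exceptions(sample))
```

Knowing the exception number does not tell you whether hardware or software is at fault; it only narrows down which KB articles to read, so hardware diagnostics are still worth running.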
 
5g6tdcv4 commented:
Are you running Broadcom NICs? There is an update for ESXi for this specific issue.
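To check which NICs use a Broadcom driver, you can list the host's network adapters (for example with `esxcfg-nics -l` from the console) and look at the driver column. Here is a rough Python sketch that picks the Broadcom entries out of such a listing. The column layout (name, PCI address, driver, and so on), the helper name `broadcom_nics`, and the sample listing are assumptions made up for this example and not output from the poster's servers; `bnx2`, `bnx2x`, and `tg3` are common Broadcom driver names.

```python
# Common Broadcom driver names; extend this set if your host uses others.
BROADCOM_DRIVERS = {"bnx2", "bnx2x", "tg3"}


def broadcom_nics(listing):
    """Return (nic_name, driver) pairs for rows that look like Broadcom NICs.

    Assumes whitespace-separated columns with the NIC name first and the
    driver third, which is an assumption about the listing format.
    """
    nics = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "Name":
            continue
        name, driver = fields[0], fields[2]
        if driver in BROADCOM_DRIVERS or "broadcom" in line.lower():
            nics.append((name, driver))
    return nics


# Hypothetical listing, invented for illustration only.
sample_listing = """\
Name    PCI      Driver  Link Speed    Duplex MAC Address       MTU  Description
vmnic0  04:00.00 bnx2    Up   1000Mbps Full   00:00:00:00:00:01 1500 Broadcom Corporation NetXtreme II BCM5709 1000Base-T
vmnic1  07:00.00 e1000e  Up   1000Mbps Full   00:00:00:00:00:02 1500 Intel Corporation 82571EB Gigabit Ethernet Controller
"""
print(broadcom_nics(sample_listing))
```

If the list comes back empty, the Broadcom-specific update is unlikely to be relevant to that host.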
 
waltforbes, Senior IT Specialist (author), commented:
Hanccocka: I ran HP Diagnostics and caught the problems: memory temperature and processor temperature failures. I am fully satisfied that this case is resolved.
Question has a verified solution.
