Solved

ESXi PSoD Exception 14

Posted on 2014-10-25
Last Modified: 2014-10-30
Hi All,

Can anyone please help me troubleshoot a random PSoD that affects one of my HP Blade servers running ESXi 5.1, as shown in the screenshot below:

[Screenshot: PSoD]
I'm not sure how to begin troubleshooting this problem.

Thanks
9 Comments
 
LVL 120

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 500 total points
ID: 40403925
Most PSODs are caused by a hardware issue.

This could be incompatible hardware that is not on the HCL.

Check the VMware Hardware Compatibility List (HCL) here:

The VMware Hardware Compatibility List is the detailed list of actual vendor devices that are either physically tested or are similar to the devices tested by VMware or VMware partners. Items on the list are tested with VMware products and are known to operate correctly. Devices which are not on the list may function, but will not be supported by VMware.

http://www.vmware.com/go/hcl

So first checks...

1. Which HP Blade Server model is it? "HP Blade" is rather generic. Is your hardware on the HCL?

2. Is your hardware up to date with firmware for the BIOS, storage, and network controllers?

3. Are you using the OEM HP version of ESXi 5.1?

4. Have you checked that the memory is seated correctly?

5. Have you checked the fans and CPU heatsinks?

6. Have you tested the memory using memtest86+?

7. If you have a support contract with HP, log a support request.

8. If you have a support contract with VMware, log a support request.

9. Random faults are difficult to track down. How many VMs were running at the time of the crash?

10. Look back at your change database: what changes have been made to the server and environment?

11. Do you have a syslog server, or persistent storage of logs, so you can go back and check /var/log/vmkernel.log for any errors before the PSOD?

12. Build version of ESXi: is it the latest?

13. Track down the World ID and the VM.

14. Is that VM's OS supported on ESXi 5.1?

15. Is the network card in the VM VMXNET3 or E1000? There have been issues with certain builds of ESXi and VM NICs causing PSODs, i.e. a bug in ESXi.

16. Is the CPU microcode supported, and are both CPUs the same?

17. Is the memory installed in the correct banks?

18. Is certified memory installed?

These are the troubleshooting steps you need to start performing.
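
For step 11, a first pass over a saved copy of vmkernel.log can be scripted. The following is only a minimal sketch: it assumes that noteworthy lines carry a "WARNING" or "ALERT" tag or mention "Exception 14", which may not match every build's log format, and the function name `suspicious_lines` is made up for illustration.

```python
import re
from typing import Iterable, List

# Assumption: interesting vmkernel.log lines contain "WARNING" or "ALERT",
# or mention "Exception 14". Adjust the pattern to match your own logs.
SUSPECT = re.compile(r"\b(WARNING|ALERT)\b|Exception 14", re.IGNORECASE)


def suspicious_lines(lines: Iterable[str], tail: int = 50) -> List[str]:
    """Return up to the last `tail` lines that match the suspect pattern."""
    if tail <= 0:
        return []
    hits = [line.rstrip("\n") for line in lines if SUSPECT.search(line)]
    return hits[-tail:]


# Hypothetical usage against a log copied off the host or a syslog server:
#     with open("vmkernel.log", errors="replace") as fh:
#         for line in suspicious_lines(fh):
#             print(line)
```

Keeping only the last matches before the crash makes it easier to see what the host complained about shortly before the PSOD, although, as noted, the log may not reveal anything.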

There is rarely a simple answer along the lines of "Ah, the PSOD is caused by that!"

We've had cases where servers had been stable for years. When we started to load them with more VMs, they used more memory, and it turned out one server had a memory fault at the top of RAM, at around 496GB. Whenever that server was heavily loaded with VMs and used that memory module, it would PSOD!
 
LVL 8

Author Comment

by:Senior IT System Engineer
ID: 40403927
OK, so in this case which log should I gather and analyze for the root cause analysis?
 
LVL 120
ID: 40404043
I've listed the log in my post!

It may not reveal anything, but it is worth a look. I'm also waiting for answers to the questions in my post.
 
LVL 8

Author Comment

by:Senior IT System Engineer
ID: 40405803
When I logged the case with HP, they recommended that I update the iLO v4 firmware from the existing v1.4.0 to v2.02 (http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=5228286&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3Didx%253D2%257CswItem%253DMTX_8372c55483b9432abd53d91951%257CswEnvOID%253D4115%257CitemLocale%253D%257CswLang%253D%257Cmode%253D4%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken), since according to them this is a well-known issue.

That seems rather strange to me: why does the PSoD show one particular VM name rather than the ESXi host?
 
LVL 120

Assisted Solution

by:Andrew Hancock (VMware vExpert / EE MVE^2)
Andrew Hancock (VMware vExpert / EE MVE^2) earned 500 total points
ID: 40405812
This was point 2 in my post: check and update the firmware!
 
LVL 8

Author Closing Comment

by:Senior IT System Engineer
ID: 40414816
Thanks !
 
LVL 8

Author Comment

by:Senior IT System Engineer
ID: 40414818
So in this case, why does the PSOD show the VM name and not the actual host name?

Was the crash caused by something in that particular VM?
 
LVL 120
ID: 40414824
It's possible. We've seen VMs with an unsupported OS or unsupported network interfaces, or VMs using defective memory, cause PSODs.

Is it always this VM?
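
One way to answer that over time is to note the name shown on each PSOD screen and tally the notes. This is just a rough sketch: the VM names below are hypothetical placeholders, and `tally_worlds` is an illustrative helper, not part of any VMware tooling.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tally_worlds(names: Iterable[str]) -> List[Tuple[str, int]]:
    """Count how often each recorded name appears, most frequent first."""
    cleaned = (n.strip().lower() for n in names if n.strip())
    return Counter(cleaned).most_common()


# Hypothetical names copied by hand from successive PSOD screens.
observed = ["app01", "db02", "APP01 "]
print(tally_worlds(observed))  # [('app01', 2), ('db02', 1)]
```

If one name dominates, that VM's OS, NIC type, and configuration are worth a closer look. If the names are spread evenly, the hardware and firmware checks are the more likely lead.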
 
LVL 8

Author Comment

by:Senior IT System Engineer
ID: 40415050
No, it is not always the same VM. I'm just curious as to why that VM name is displayed on the PSoD.

The next time the crash happens, I'll gather some more information and post it here.

My manager doesn't like the idea of upgrading the firmware for all of the Blade components for the time being, unless it is required in order to upgrade from ESXi 5.1u1 to ESXi 5.5 and above.