


Posted on 2009-05-06
Medium Priority
Last Modified: 2012-05-06
We set up an ESXi box just over a month ago. There are 4 VMs running on it. It had been running fine for weeks, but in the past 3 days it has pink-screened twice.

The First time - it was an Out of Memory Error

The Second time - it was an unknown error ( I have attached this screen shot )

Unfortunately, the RAID card in our server was not fully supported at the time we installed ESXi, so ESXi is currently running off a USB key.

Server Specs:

Mobo: Supermicro x7DWU
CPU: 2x Intel Xeon E5405 @ 2.00GHz
Raid: 3Ware 9650SE
HDD: 6TB (Raid 10) - 3TB usable.

The pink screen doesn't bother me all that much - mostly because I'm a Windows guy =P
That said, it would be really nice to get rid of the pink screen altogether.

But what bugs me most is: why doesn't the pink screen auto-reboot?
Is there a way to force the pink screen to auto-reboot?

Please post any suggestions.


Question by:svelluto
LVL 19

Expert Comment

by:vmwarun - Arun
ID: 24316472
PSOD or Purple Screen of Death can occur due to multiple reasons.

If it occurs during installation, the most probable reason is an attempt to install ESX/ESXi on unsupported hardware, i.e. hardware not listed in the VMware HCL (Hardware Compatibility List).

Sometimes the installation may be successful, but when the ESX Server loads, it may throw PSOD because of the inability to load unsupported drivers.

Expert Comment

ID: 24317182
I suspect that you didn't create a diagnostic partition immediately after you installed the ESX OS.
Therefore, when the PSOD occurs, it doesn't have anywhere to dump the log.
Maybe you could move ESXi to a much larger memory stick, create a diagnostic partition and see if this helps.
The diagnostic partition will be about 100MB.
You create it via disk management under Configuration.

Author Comment

ID: 24317569
For the three weeks prior to these two stalls, ESX was working just fine, and then, out of the blue, two PSODs in a couple of days.

ESX is currently running on a 1GB USB key, so it should have sufficient space remaining.


LVL 21

Accepted Solution

za_mkh earned 1000 total points
ID: 24319538
It looks like there is an issue with one of your physical CPUs. Maybe try disabling one, or check that the processors are still securely seated? Same goes for the memory.
Another thread I found after googling one line from your PSOD screenshot seems to point to making sure the BIOS is up to date, including microcode updates.
Hope this helps

Author Comment

ID: 24321389
Thanks - I'll take a look at the BIOS update.

As for checking that the CPU and memory are securely plugged in: this is a server sitting in a rack at the datacenter, so it may take some time to check that. The BIOS I can check next time I reboot it.


Expert Comment

ID: 24326614
Do you have a diagnostics partition?

Author Comment

ID: 24327736
A diagnostics partition?
I don't think so. How do we set that up?
LVL 19

Expert Comment

by:vmwarun - Arun
ID: 24338002
The default partitioning layout of a normal ESX 3.5 Server would be

/ (root) - 5 GB
/boot - 100 MB
/var/log - 2.5 GB
swap - 544 MB
vmkcore (Diagnostic Partition) - 100 MB
vmfs - Remaining Space.

vmkcore is used to store the diagnostic info when the ESX/ESXi host PSODs.
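
To put the fixed sizes listed above in perspective, here is a rough Python sketch that totals them. It assumes 1 GB = 1024 MB, and the names `LAYOUT_MB`, `fixed_overhead_mb` and `vmfs_remaining_mb` are made up for illustration; this is not a VMware tool, just arithmetic on the figures in the list.

```python
# Sizes copied from the layout listed above, converted to MB.
# Assumption: 1 GB = 1024 MB. "vmfs" is left out because it simply
# takes whatever space remains on the disk.
LAYOUT_MB = {
    "/ (root)": 5 * 1024,
    "/boot": 100,
    "/var/log": int(2.5 * 1024),
    "swap": 544,
    "vmkcore": 100,
}


def fixed_overhead_mb(layout):
    """Total space taken by the fixed-size partitions, in MB."""
    return sum(layout.values())


def vmfs_remaining_mb(disk_mb, layout):
    """Space left over for VMFS on a disk of disk_mb megabytes."""
    remaining = disk_mb - fixed_overhead_mb(layout)
    if remaining < 0:
        raise ValueError("disk is too small for the fixed partitions")
    return remaining


if __name__ == "__main__":
    total = fixed_overhead_mb(LAYOUT_MB)
    print(f"fixed partitions: {total} MB")
    print(f"vmkcore share: {LAYOUT_MB['vmkcore'] / total:.1%}")
```

By these figures the fixed partitions come to roughly 8.2 GB, of which the vmkcore partition is only a small fraction.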


Expert Comment

ID: 24338335
Via the VI Client, go to Configuration -> Storage and choose Add Storage.
It will be the 3rd option.

Expert Comment

ID: 24805579
Your PSOD shouldn't auto-reboot as you should be left with a prompt to 'Press Escape to enter local debugger', which I assume is the faux COS console.  It acts like the dump didn't complete for some reason.

Not that I'm a guru or anything, but my understanding with ESXi (and I'm emphasizing the 'i', which everyone seems to overlook and which makes VMware support cringe even though their sales reps are pushing it) is that you do not have the option to alter the partition build.  The installer does this automatically and should create the vmkcore partition (per http://www.boche.net/blog/?p=120 ).  However, you can enable the unsupported faux COS console and manually create one; as per the link given above and below, it is possible to trash the default install this way.

Now getting that dump file isn't so easy peasy... http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004128 describes how to do this from ESXi (emphasizing the 'i').  This article specifically notes that the "USB key for ESXi Embedded contains a VMKcore partition".  It seems you have to get into DEBUG mode, which the PSOD 'should' allow you to do... however, in my experience with the one I've suffered, I could not get into that mode.

I can see that I have the VMKcore partition within the VI Client by going to the Configuration tab, Storage, right-clicking and choosing Properties on the ESXi system disk; the Extent Device pane shows VMware Diagnostic of 109MB.  You can also add it this way or possibly as Markzz indicates above (I don't have any spare disk space on my system to verify I can create a VMFS diagnostic -fc- type partition).

Being a USB version, you could mount the USB /dev/disks/vmhba32:0:0:7 volume on another *NIX box and retrieve the file.  I'm not *NIX savvy enough to know if you could access this from a Windows box (although the article notes you can use Disk Management/FDisk to see that the VMKcore partition exists).

You cannot browse the VMKcore partition in the Datastore Browser, nor access it with the freeware Veeam Backup & FastSCP (not on v3; I've not tried the v4 just released).  Nor can I verify that any of this is valid for vSphere4i.

Sorry I'm not giving you a direct answer, but it's the best I've got.  I'd say that if you do have a VMKcore partition and a diag of the USB key integrity pans out (I don't know how to do that off the top of my head from a *NIX box... DO NOT try to run a chkdsk from a Windows mount), then it's likely RAM, CPU or mobo.  The dump is supposed to help give you more info, but I would suspect that normal hardware diags and/or a process of elimination by removing/disabling various components will end up being what you'll have to do.

Hope this helps at least with places to look for info...


Question has a verified solution.
