Purple Screen of Death, IBM VMware server

Purple Screen using vSphere on IBM server.

Over the last 6 weeks or so, the server has locked up 5 different times, to the point that a hard power down and restart has to be done. The server is an IBM server with 3 virtual servers set up on it:

AD - Windows Server 2012 R2
File/RWW Server - Windows Server 2012 R2
Exchange - Windows Server 2012

It is running vSphere and VMware ESXi 5.1.0.

Normally the screen in the server room is black with white text on it; when the server goes down and locks up, that screen changes to purple. I don't know a ton about VMware or IBM. There is a network cable plugged into the IMM port, but I don't know the IP address to check for errors in IMM. It's a server I can't take down often, as we have employees spread throughout the country, so there is no real time to take it down and try to figure it out.

In vSphere I can't find anything in the logs, or in the event viewers on the individual servers.

Anyone know of anything I can check?
FosterThomasAsked:
FosterThomasAuthor Commented:
I should add, there is nothing similar in the time or day of the week it locks up: it's been on a Sunday during the day, a Friday during the night, and anywhere in between.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
What you describe is a Purple Screen of Death, a hardware host crash!

The cause is usually hardware, a driver fault, or a bug within the ESXi OS.

1. Update ESXi 5.1 to the latest (and last) version, which is ESXi-5.1.0-20160504001-standard (Build 3872664), released in May 2016.

2. Check that the VMs are using the VMXNET3 interface and not the E1000 interface; there are known bugs with the latter.

3. Check the host hardware for memory faults using memtest86+, and check the network interface firmware and storage controller firmware, i.e. update it all.

4. Check that the CPU, fans and heat sinks are all working correctly.

5. Run Server diagnostics...

FosterThomasAuthor Commented:
This sounds like a stupid question, but where do I update ESXi from? Through vSphere? I am currently running 5.1.0 build 1065491.
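A minimal sketch for comparing the running build with the final 5.1 patch build mentioned earlier in the thread (3872664). It assumes the `vmware -v` output format "VMware ESXi 5.1.0 build-NNNNNNN"; the function names are made up for illustration.

```shell
#!/bin/sh
# Final 5.1 patch build named in this thread (Build 3872664).
TARGET_BUILD=3872664

# Extract the numeric build from a string such as
# "VMware ESXi 5.1.0 build-1065491" (the format `vmware -v` prints).
build_number() {
    echo "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# Succeeds (exit 0) when the given build is older than TARGET_BUILD.
needs_update() {
    [ "$1" -lt "$TARGET_BUILD" ]
}

# Example using the build reported in this post; on the host itself you
# would pass "$(vmware -v)" instead of the literal string.
current=$(build_number "VMware ESXi 5.1.0 build-1065491")
if needs_update "$current"; then
    echo "build $current is older than $TARGET_BUILD"
fi
```

On the host, `esxcli system version get` should report the same version and build fields if you prefer that over `vmware -v`.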
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
That has not been updated for a while.

If you put the host in maintenance mode, you can type the following at the console or remotely via SSH

(all VMs need to be off!)

This will update to the last ever patch...

esxcli network firewall ruleset set -e true -r httpClient
esxcli software profile update -p ESXi-5.1.0-20160504001-standard -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml
esxcli network firewall ruleset set -e false -r httpClient

FosterThomasAuthor Commented:
Thanks, I will do this this weekend, or at a time when I can power down the servers, hopefully soon.
FosterThomasAuthor Commented:
By off, you just mean in vSphere go to each VM and shut down the virtual machine, correct?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Correct, you cannot enter maintenance mode and update the host without turning off the VMs.
FosterThomasAuthor Commented:
Thank you, I will choose your answer as best solution as soon as I can apply this
FosterThomasAuthor Commented:
One other quick question: I see in vSphere that we have 32 GB of RAM, and we are constantly hovering around 31 GB used. Is that normal on a VM server like this? It seems our Exchange server, which is used by 70-some users daily, is crushing the memory.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
No it's not normal!

If you are using 31 out of 32GB ram you have either overcommitted or you need to increase ram in host!
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
What RAM is allocated to the VMs?

You only have 3.
FosterThomasAuthor Commented:
See the attached for Allocation
Capture.JPG
allocation.JPG
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
With that current loading you either need to upgrade the RAM in the host (32GB of RAM in a server is very low; my workstation here has 64GB!), or reduce the RAM in VMs which are not using it.
FosterThomasAuthor Commented:
If it reaches 32 GB, could that cause the purple screen? Though some of the times it has happened, like a Sunday morning, no one is working, so the RAM usage cannot be that high at those times.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
To be honest with you, you should not let it pass 28GB.

Because the host server has no RAM left to give, it will start swapping to disk, which will cause a slowdown of the host and all VMs.

What memory have you allocated to a Domain Controller?
FosterThomasAuthor Commented:
Look at my picture attached a few posts up: under the memory tab in allocation it just has unlimited written next to all of the VMs.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Edit each VM: what memory has been allocated?
FosterThomasAuthor Commented:
The purple screen happened again first thing this morning. Tomorrow at 5pm my time I am shutting down the office and trying to update ESXi. Is there anything else I should be looking for while on the console in maintenance mode?

When I edit each VM, in memory it just says unlimited (in MB). Should I set each one to a specific amount instead?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
The purple screen happened again first thing this morning, tomorrow at 5pm my time I am shutting down the office and trying to update ESXi is there anything else I should be looking for while on the console in maintenance mode?

No, nothing other than those commands, assuming your ESXi server has access to the internet.

Please see attached for what I see when I click Edit VM Settings (every VM must be allocated memory, which is not unlimited!)

2018-02-21-13_27_51-CYRUS-VCENTER1--.png
Is there a value for each of your VMs?
FosterThomasAuthor Commented:
Sorry, I misunderstood (see Capture.JPG). All 3 VMs are set to 12 GB, which would add up to 36 GB, but they are also grayed out so I can't change them.

I have:

AD/DC
Exchange
App

All three are set to 12GB's
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
You CANNOT change them when the VM is powered on.

You DO NOT have enough memory to set all VMs to 12 GB each!

Your DC is not going to need 12GB, I would reduce to 4GB.

Not sure what App is, but exchange probably needs more.

Bottom line, your host does not have enough memory.
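The arithmetic behind that point can be sketched in a few lines of shell. This is only an illustration using the numbers from this thread (a 32 GB host, three VMs at 12 GB each); the `headroom` function name is made up, and it ignores the memory the hypervisor itself needs.

```shell
#!/bin/sh
# Sum a list of per-VM memory sizes (GB) and compare with host RAM (GB).
# Usage: headroom HOST_GB VM_GB...
headroom() {
    host=$1
    shift
    total=0
    for gb in "$@"; do
        total=$((total + gb))
    done
    # A negative headroom means the VMs are allocated more than the host has.
    echo "allocated=${total}GB host=${host}GB headroom=$((host - total))GB"
}

# The three VMs at 12 GB each on the 32 GB host.
headroom 32 12 12 12
```

With these inputs it prints `allocated=36GB host=32GB headroom=-4GB`, i.e. the configured memory exceeds physical RAM by 4 GB before the host's own needs are counted.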
FosterThomasAuthor Commented:
Could that be the cause of the purple screen? I have never messed with those settings before, so they have all been set to 12GB for 3+ years, since we purchased the server.

The App server is where we host all of our company files that people shortcut to, and where RWW and Microsoft Server Essentials are housed.

So when I power down the VM, it will still appear in that list and I can edit it at that time?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
If it's been like that for 3 years with no issue, something else is faulty!

It could be memory in the host which is being used...

So when I power down the VM machine it will still appear in that list and i can edit at that time?

Yes.

I would be inclined to update the host.

Then I would change AD/DC to 4GB and App to 8GB,

then add 4GB to Exchange.

That will reduce the memory load on the host and should not affect performance, but you can monitor that!
FosterThomasAuthor Commented:
Hopefully my last stupid question: I power it down and put it in maintenance mode, then go to the server room and enter the commands you sent above. With the server being down and housing the DC, will it have internet?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
With the server being down and housing the DC will it have internet?

No idea!

I don't know how your internet is configured in your organisation...
FosterThomasAuthor Commented:
Haha, I understand that.

What I meant was: the servers have to be shut down to enter maintenance mode. If I don't have internet when trying to enter the commands on the console, how do you update ESXi then? Since it has to be in maintenance mode with internet, if for some reason I don't have internet, does that mean I cannot update ESXi?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
You'll have to do it manually: download the update (when you have internet access!), manually upload it to the host and update it.

See here

HOW TO: Upgrade VMware ESXi 5.1 to ESXi 6.0 in 5 easy steps

but substitute the ESXi 5.1 update file for the ESXi 6.0 one!
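A rough sketch of what the manual route can look like once the offline bundle has been uploaded to a datastore. The datastore name and bundle filename below are hypothetical placeholders (neither appears in this thread), and the profile name is the one quoted earlier; check the output of the first command for the exact profile name in your bundle. The script only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical values: substitute your own datastore name and the bundle
# file you actually downloaded. Neither appears in this thread.
DATASTORE="datastore1"
BUNDLE="ESXi-5.1-offline-bundle.zip"
# Profile name quoted earlier in this thread.
PROFILE="ESXi-5.1.0-20160504001-standard"

# Build the full path to the uploaded bundle on the host.
bundle_path() {
    echo "/vmfs/volumes/$1/$2"
}

# Print (not run) the commands: list the profiles inside the bundle,
# then apply one of them from the local file instead of the online depot.
offline_update_commands() {
    depot=$(bundle_path "$DATASTORE" "$BUNDLE")
    echo "esxcli software sources profile list -d $depot"
    echo "esxcli software profile update -d $depot -p $PROFILE"
}

offline_update_commands
```

The same conditions apply as for the online update: VMs shut down and the host in maintenance mode before the update command is run.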
FosterThomasAuthor Commented:
That way looks much harder, hopefully the way above works

It's as simple as shutting down VM's, entering maintenance mode and typing those commands?   Do you have a guide for doing it that way?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
It's as simple as shutting down VM's, entering maintenance mode and typing those commands?

yes.

Guide:-

1. Shutdown VMs.

2. Enter Maintenance Mode.

3. At the console or remotely via SSH, type the commands above!

Do you want a guide for that?

Before you even try the commands, you may want to test internet access while your DC is down...

e.g. ping www.google.com at the console (if it fails, you have no internet!)
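The steps above could be strung together roughly as follows. This is a sketch, not a tested procedure: the `run`, `have_internet` and `apply_update` helpers are made-up names, the maintenance mode line assumes you have not already entered it from the vSphere Client, and by default the script only prints the commands (DRY_RUN=1) so it can be read through safely first.

```shell
#!/bin/sh
# Profile and depot URL copied from the commands earlier in this thread.
PROFILE="ESXi-5.1.0-20160504001-standard"
DEPOT="https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml"

# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 to actually execute them on the host.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds if one ping to the given name (default www.google.com) gets a reply.
# A failure here usually means no DNS or no route out while the DC is down.
have_internet() {
    ping -c 1 "${1:-www.google.com}" >/dev/null 2>&1
}

# Steps 2 and 3 of the guide, in order. All VMs must already be shut down.
apply_update() {
    run esxcli system maintenanceMode set -e true
    run esxcli network firewall ruleset set -e true -r httpClient
    run esxcli software profile update -p "$PROFILE" -d "$DEPOT"
    run esxcli network firewall ruleset set -e false -r httpClient
}

# Typical use on the host once you are happy with the printed commands:
#   have_internet && DRY_RUN=0 apply_update
apply_update
```

Checking connectivity before the update avoids waiting on a profile update that can never reach the depot; if the ping fails, the manual (offline) update is the fallback.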
FosterThomasAuthor Commented:
thank you sir, VMware is new to me
Roshan MohammedCloud Engineering OfficerCommented:
It may also pay to update the firmware of your server and its components.
FosterThomasAuthor Commented:
The IBM firmware?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
How did your update go, and reducing memory in VMs ?
FosterThomasAuthor Commented:
I am doing it in three hours, at 5pm our time when I can get all the employees out of here
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
ok...
FosterThomasAuthor Commented:
So I am in maintenance mode. At the console I pressed F2 and enabled the shell, then hit Alt+F1 and went to the shell.

The first time, I typed in the login and password just fine, hit enter and started entering the lines above.

1) How long should it take once I hit enter after the second line above? The cursor just flashed and flashed. Then when I hit Alt+F2 to go back to the main screen and then went back to the shell, it said it was a bad URL; I verified I typed it exactly as you had it above. So I went back into the shell to try again, and now I can type in the login name, but when I get to the password it won't accept any characters; it just flashes and won't type. I can hit enter and it says wrong username and password, but it won't physically let me type anything in the password field. So I can't even log back in to try it again. The keyboard is working fine, because I can enter the username and hit Alt+F2 to go back to the main console.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
if you are referring to this line

esxcli software profile update -p ESXi-5.1.0-20160504001-standard -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml

It depends on how fast your internet connection is... but I did suggest you check whether your ESXi host has internet access first! (And as 5.1 is End of Life, support could have been removed for it, and hence maybe no updates!)

Bad URL suggests no internet.

You will have to complete a manual update, or forget updating, make the VM memory changes, and wait to see if a crash occurs again!

If it's happening so often (host crash), you'll know within 24 hours!
FosterThomasAuthor Commented:
Next issue: I took the server out of maintenance mode, rebooted the server and brought the three VMs back, and AD/DC has internet access but no other servers or computers do. Frustrating.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
I would post a new question on Internet Access, for Experts to work through some troubleshooting...
FosterThomasAuthor Commented:
I got it back and working

Why, after the first time, would the shell not allow me to enter the password when all other functions on the keyboard worked?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Why after the first time would shell not allow me to enter the password but all other functions on the key board worked?

no idea, use ssh. (remote)

maybe you foo bared the console with you fuu man chu!

Glad your internet is back. Change the memory settings on the VMs and monitor performance.

If you need more physical memory in the host, you'll need to purchase and install it!
FosterThomasAuthor Commented:
I changed the memory and now it's hovering around 21 GB used, where before it stayed at 31 GB.

Is SSH something I have to download?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
21GB is much better, because the host now also has some memory to use, and it's not all used.

It's possible you could have a high memory DIMM fault, or ESXi 5.1 is not happy being heavily over committed

I would personally wait and see if you get another crash, within 24 hours.

as for SSH.

Part 5: HOW TO: Enable SSH Remote Access on a VMware vSphere Hypervisor 5.1 (ESXi 5.1)
FosterThomasAuthor Commented:
The crashes aren't every day; we've had 4 purple screens of death in 5 weeks that just happen at bad times. Well, any time the server is down it's bad, but you get the idea.
FosterThomasAuthor Commented:
This morning the server is back to averaging 31 GB of RAM.

AD - 4 GB
App - 8 GB
Exchange - 14 GB

That's 26 GB. Is the host really using 5 GB of RAM?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
The VMs which have been defined as 4, 8 and 14GB cannot use any more than what has been defined.

So a total of 26GB Max allocated to VMs.

In a server which has 32GB... it's close.

What does the summary state for the host, a screenshot would help...
FosterThomasAuthor Commented:
I finally talked to the guy who was here before me and installed the server.

He said the patch for ESXi is installed on an internal USB drive? That updating through the shell won't work because it is not stored on the HDD and is stored on USB. Have you seen that before?

Also, to upgrade to 64 GB I just asked for a quote from my provider, but they are saying the free version allows a max of 32GB, and that I need a VM Essentials License to upgrade the RAM.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
He said the patch for ESXi is installed on an internal USB drive?  That updating through the shell won't work because it is not stored on the HDD and is stored on USB, have you seen that before?

What twaddle and foo bar!

Most if not all ESXi installations are installed on SD or USB flash drives. How does your guy suggest you update it then?

Also to upgrade to 64 GB's I just asked for a quote from my provider, but they are saying the free version allows a max of 32GB, that I need VM Essentials License to upgrade RAM

5.1 free version is limited to a max of 32GB.

5.5 and higher have removed that restriction in FREE.

HOW TO: Upgrade from VMware vSphere Hypervisor ESXi 5.1 to VMware vSphere Hypervisor ESXi 5.5 for FREE

HOW TO: What's New in VMware vSphere Hypervisor 5.5 (ESXi 5.5)