Cannot figure out why server keeps crashing

Hi.

We have 4 Windows Server 2012 R2 VMs running on an ESXi 5.5.0 host (build 1623387).
The hardware is an HP ProLiant DL380p Gen8.

3 of the Windows servers run fine, without problems.
However, we do have problems with the last one. It is used as a terminal server for our customers, with a maximum of 15 users at a time.
It has been given 16 GB RAM, 16 cores (4 CPUs with 4 cores each), a 60 GB hard drive and a VMXNET 3 NIC (as the default NIC has problems on 2012 R2 servers).

The server crashes all of a sudden, and we can't seem to find a reason for it. It is running on the same hardware as the others, so it seems weird.

I've attached dump files:
081414-19578-01.dmp
081414-20718-01.dmp
081514-20531-01.dmp
081614-18562-01.dmp
081614-32734-01.dmp
081714-18359-01.dmp
081714-18453-01.dmp
081714-19109-01.dmp
081814-19343-01.dmp
081814-25734-01.dmp
081914-18000-01.dmp
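
The dump file names can be tallied by day to see how often the server went down. The Python sketch below is only an illustration: it assumes the first six digits of each name are the crash date as MMDDYY (which fits the 14-08 start of the crashes mentioned later in the thread) and ignores the other fields. The helper names `crash_date` and `crashes_per_day` are made up for this example.

```python
from collections import Counter
from datetime import date, datetime

# File names exactly as attached to the question.
DUMP_FILES = [
    "081414-19578-01.dmp",
    "081414-20718-01.dmp",
    "081514-20531-01.dmp",
    "081614-18562-01.dmp",
    "081614-32734-01.dmp",
    "081714-18359-01.dmp",
    "081714-18453-01.dmp",
    "081714-19109-01.dmp",
    "081814-19343-01.dmp",
    "081814-25734-01.dmp",
    "081914-18000-01.dmp",
]


def crash_date(filename: str) -> date:
    """Parse the leading field of a dump name, assumed to be MMDDYY."""
    prefix = filename.split("-", 1)[0]
    return datetime.strptime(prefix, "%m%d%y").date()


def crashes_per_day(filenames) -> Counter:
    """Count how many dump files were written on each day."""
    return Counter(crash_date(name) for name in filenames)


if __name__ == "__main__":
    for day, count in sorted(crashes_per_day(DUMP_FILES).items()):
        print(day.isoformat(), count)
```

Under that assumption the list shows at least one crash on every day from the 14th to the 19th.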
Asked by: BPB

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
That seems a lot of CPU!

Is it needed!?

Have you tried reducing it to 2 vCPUs (sockets)?

Ensure ALL firmware is up to date on the host, using the latest HP Firmware DVD. Also ensure you are using the latest ESXi 5.5 build. Are you using the HP OEM version?
BPB (Author) commented:
We have a very powerful server and only 4 Windows servers, so we just gave the VM the resources because we didn't have anything else to use them on. I'm sure they aren't needed, but why not.

When we got the server about 4 months ago, we ran the latest HP firmware DVD to ensure everything was up to date.
We are using the ESXi version I wrote in the original post, and it is a clean ESXi install, as HP recommended.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
As for the CPUs in your VM...

More vCPUs can make a VM slower.

vSMP (virtual SMP) can hurt virtual machine performance when too many vCPUs are added to virtual machines that cannot use them effectively. Examples of servers that can use vSMP properly are SQL Server and Exchange Server.

Many VMware administrators think that adding lots of processors will increase performance. Wrong! (And because they can, they just go silly!) Sometimes there is confusion between cores and processors, but what we are adding is additional processors in the virtual machine.

So 4 vCPUs in a VM make it a 4-way SMP (quad-processor) server. If you have an Enterprise Plus license you can add 8 (and the OS will only recognise them all if you have the correct OS license).

If applications can take advantage of them (e.g. Exchange, SQL), adding additional processors may increase performance.

So the usual rule of thumb is: try 1 vCPU, then try 2 vCPUs, and knock back to 1 vCPU if performance suffers. Only use vSMP if the VM can take advantage of it.

Example: a VM with 4 vCPUs allocated.

My simple layman's explanation of the scheduler:

As you have assigned 4 vCPUs to this VM, the VMware scheduler has to wait until 4 cores are free and available. To do this, it has to pause the first cores until the 4th is available, and during this time the paused cores are not available for other processes. This is my simplistic view, but the bottom line is that adding more vCPUs to a VM may not give you the performance benefits you expect unless the VM and its applications are optimised for additional vCPUs.
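
That simplistic view can be put into numbers with a toy model. The Python sketch below is not how the ESXi scheduler is implemented; it assumes strict "all vCPUs at once" scheduling, assumes every physical core is idle independently with the same probability, and uses a hypothetical 16-core host that is idle half of the time. The function name `chance_all_vcpus_can_start` is made up for this illustration.

```python
from math import comb


def chance_all_vcpus_can_start(physical_cores: int, vcpus: int, p_core_free: float) -> float:
    """Probability that at least `vcpus` of `physical_cores` cores are idle at once.

    Toy model only: each core is idle independently with probability
    `p_core_free`, and the VM may run only when every one of its vCPUs
    can be placed on an idle core at the same moment.
    """
    if not 0 <= vcpus <= physical_cores:
        raise ValueError("vcpus must be between 0 and physical_cores")
    if not 0.0 <= p_core_free <= 1.0:
        raise ValueError("p_core_free must be a probability")
    p_busy = 1.0 - p_core_free
    # Binomial tail: sum the chances of exactly k idle cores for k >= vcpus.
    return sum(
        comb(physical_cores, k) * p_core_free**k * p_busy ** (physical_cores - k)
        for k in range(vcpus, physical_cores + 1)
    )


if __name__ == "__main__":
    # Hypothetical host: 16 physical cores, each idle 50% of the time.
    for n in (1, 2, 4, 8, 16):
        print(f"{n:2d} vCPUs: {chance_all_vcpus_can_start(16, n, 0.5):.4f}")
```

In this model the chance of finding enough idle cores at the same moment can only fall as the vCPU count rises, which is the point of the rule of thumb above.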

See here
http://www.vmware.com/resources/techresources/10131

see here
http://www.gabesvirtualworld.com/how-too-many-vcpus-can-negatively-affect-your-performance/

http://www.zdnet.com/virtual-cpus-the-overprovisioning-penalty-of-vcpu-to-pcpu-ratios-4010025185/

also there is a document here about the CPU scheduler

www.vmware.com/files/pdf/perf-vsphere-cpu_scheduler.pdf

https://blogs.vmware.com/vsphere/2013/10/does-corespersocket-affect-performance.html

I'm not sure from your comments about ESXi whether you are using the HP OEM version (e.g. downloaded from HP), but you should be!
rindi commented:
Next time, please zip your minidump files. It makes it easier for you and us, as you can upload a single file and we can download a single file.

Make sure your OS is fully patched, and that you have installed the newest Hyper-V integration features. Also make sure the applications running inside the RDP sessions are fine and up to date. It looks as if some of those sessions are ill-behaved and don't release all their allocations when they are unloaded. A further countermeasure against crashes, particularly useful on RDP servers, is to restart the server regularly so its resources become available again.
compdigit44 commented:
I ran into an issue recently where a TS server was crashing because of a bad printer driver.
BPB (Author) commented:
I haven't really gotten a useful answer, or if I did, I need some more guidance on how to resolve it!

Please, somebody.
rindi commented:
As I mentioned in my comment, make sure your server's OS is fully up to date, and also all the software running on it that your clients use in their remote desktop sessions. Your dumps point to software that does not release all the resources it used during the sessions after logoff, eventually leading to the crashes as no resources are left.

Also, as I mentioned above, regularly reboot the server before it gets to the crash. Controlled reboots are always better than BSODs.
BPB (Author) commented:
The server is fully updated. As for the clients, I can't be sure.

There are a lot of scattered clients, all running at least Windows 7 though.

Our terminal users come from a lot of different companies working almost around the clock, some in the early morning, others in the evening or at night, so we keep our scheduled reboots to a minimum, usually in connection with Windows Update.

I have just now updated all installed runtimes: Silverlight, Air, Java, Java x64, Shockwave.
Adobe Reader is currently in use by a user and was not updated.
BPB (Author) commented:
I've looked at Windows Update: on 13-08 we installed 25 updates, and then 2 more on 14-08.

After the 14th, the reboots started.

Could it be a faulty Windows update?
rindi commented:
Last Patch Tuesday did have a bad patch, but I don't know if it also affected your OS, and as far as I understand it caused different behavior.
BPB (Author) commented:
Attachment: updates.png
These are the updates that were installed. Do any of them look like red flags?
rindi commented:
It was kb2982791 that caused this issue, but as far as I know that was on Windows 7 and possibly 2008 R2. I don't think Windows 8.x and 2012 had that problem. But you can always try removing that patch.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Roll back your servers to the backup you made before patching.
BPB (Author) commented:
We don't back up the Windows servers, just the data.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Well if you think it's happened since the last change, you'll have to start uninstalling updates.

We would never apply any updates or changes to any server without ensuring we had a full backup to roll back to. (Good ITIL Change Control Management!)
BPB (Author) commented:
Well, we usually just uninstall updates via Remove Programs, or else use System Restore if needed.

Our terminal server really doesn't have that much configured, and backing it up every time we update is more work than making a new one.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
"Well, we usually just uninstall updates via Remove Programs, or else use System Restore if needed.

Our terminal server really doesn't have that much configured, and backing it up every time we update is more work than making a new one."

If that works for your organisation and you are happy with it, fine.

Our clients can restore VMs in 15-60 seconds, so a backup has many advantages over re-installing the OS from scratch.

You will need to uninstall updates if these crashes are linked to them. I've not heard anything recently about Windows updates causing issues.
BPB (Author) commented:
Hi again.

I first tried uninstalling the update we discussed, kb2982791, but it made no difference.
So I then uninstalled all updates that had been installed right before the random crashes started.

It has now been 6 days, and the server is stable.
I will give it another week and then start reinstalling updates, a couple at a time, to see if the error comes back, so that I can find out which one is the problem.
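
One way to keep the number of test rounds down is to reinstall half of the remaining suspects at a time instead of a couple at a time. The Python sketch below shows that bisection idea and nothing more: it assumes exactly one update is at fault and that a crash reliably shows up within the observation period. `find_bad_update` and the `is_stable_with` callback (which stands in for "install these updates, wait, and see whether the server stays up") are hypothetical names, and the update labels are placeholders rather than real KB numbers.

```python
from typing import Callable, List, Sequence


def find_bad_update(
    updates: Sequence[str],
    is_stable_with: Callable[[List[str]], bool],
) -> str:
    """Narrow a list of suspect updates down to the single faulty one.

    Assumes exactly one update in `updates` causes the crashes and that
    `is_stable_with(batch)` reports whether the server stays up after
    installing `batch`.
    """
    if not updates:
        raise ValueError("no updates to test")
    candidates = list(updates)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if is_stable_with(first_half):
            # First half is clean, so the culprit is in the rest.
            candidates = candidates[middle:]
        else:
            # The crash came back, so the culprit is in the first half.
            candidates = first_half
    return candidates[0]


if __name__ == "__main__":
    # Placeholder labels for the 25 + 2 updates mentioned in the thread.
    suspects = [f"update-{i:02d}" for i in range(1, 28)]
    culprit = "update-17"  # hypothetical, for demonstration only
    found = find_bad_update(suspects, lambda batch: culprit not in batch)
    print(found)
```

For the 27 updates removed here, this needs at most five observation periods, provided the single-culprit assumption holds.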

Is there a way to keep this question open for 2 weeks?
When and if I find the buggy update, I would like to share it as an answer.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Maybe now would be a good time to back up the server before applying any updates, and to back up before each update!

(Or use snapshots, but I would not leave it running on a snapshot disk for more than 2 days!)

You should be able to leave the question open....
BPB (Author) commented:
Hi again.

After being spammed by the Experts Exchange auto system, I have to respond :P

The server is still running stable. It was one of the updates.

We have not had the time to try reinstalling the updates to find out more about what went wrong, so I'll close the question now.
BPB (Author) commented:
I figured out it was Windows Update by myself.
I did not find out which update was the problem.