Hyper-V Machines in Stopping State

Why are my Hyper-V machines stuck in the Stopping state?
It's not because of hard drive space: 472 GB free of 1.09 TB.
30 VMs, of which 14 are in the Stopping state.

Server 2016 Datacenter
HP DL380 G6
96 GB RAM
2x E5649 6-core CPUs
[Attached screenshot: 2018-12-30_20-14-31.png]
David Johnson, CD, MVP (Retired) asked:
Tom Cieslik (IT Engineer) commented:
Hi David.
You said you have 30 VMs, right?
Your host has 2 processors with 6 cores each, so 2 x 6 x 2 = 24 threads (virtual processors, counting Hyper-Threading).
The Microsoft VM best-practice rule is not to go beyond the virtual processor limit, keeping 2 VPs for the host.
Put mathematically: maximum VPs for guests = total VPs - 2 for the host OS.
If for some reason a few of your VMs have more than one VP assigned, you have a bigger issue here.
Maybe this is your problem: you have too many VMs on a single host for this hardware configuration.
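
To make the arithmetic above concrete, here is a minimal Python sketch of the same count. The "reserve 2 for the host" figure is the guideline as stated in this post, not something verified here, and the one-vCPU-per-VM value and all variable names are assumptions made for illustration.

```python
# Worked version of the vCPU count from the post above.
SOCKETS = 2
CORES_PER_SOCKET = 6
THREADS_PER_CORE = 2   # assumes Hyper-Threading is enabled
HOST_RESERVED = 2      # guideline as stated in the post
VM_COUNT = 30
VCPUS_PER_VM = 1       # assumption: one vCPU assigned to every VM

# Logical processors the host exposes: 2 x 6 x 2
total_threads = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE

# What is left for guests after setting aside the host's share
guest_budget = total_threads - HOST_RESERVED

# vCPUs actually handed out, and how far that exceeds the budget
assigned = VM_COUNT * VCPUS_PER_VM
over_budget = assigned - guest_budget

print(f"logical processors: {total_threads}")
print(f"guest budget:       {guest_budget}")
print(f"assigned vCPUs:     {assigned}")
print(f"over budget by:     {over_budget}")
```

Under these assumptions the host is 8 vCPUs over the stated budget. Whether that is enough to leave VMs stuck in Stopping, rather than just slow, is a separate question.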
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
On the host bring up ResMon.EXE and head to the Disk tab. Does Disk Activity show any VHDX files being worked on?
Shaun Vermaak (Technical Specialist) commented:
What do the event logs say?
David Johnson, CD, MVP (Retired), author, commented:
1 vCPU per machine. Disk and CPU contention would only make things run super slow.

I actually had to restart the host. The VMs restarted, and then a bunch of Altaro temporary snapshots were removed.

There was an Altaro backup running at the time. One lab has Altaro and another has Veeam B&R.
Tom Cieslik (IT Engineer) commented:
OK, but you still have more VMs than available vCPUs.
You should not, because strange things can happen.

David Johnson, CD, MVP (Retired), author, commented:
I will split the VMs and run them on the replica servers to see if the vCPUs are the problem.
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
If there were Altaro snapshots happening at the time then that's the source of the "issue". That would have become readily apparent as well with the Disk Activity check.

What is the storage subsystem setup?

Break up the Altaro backup schedules to stagger the VM backups. Don't run them all at once. A VSS snapshot is extremely I/O intensive, hence the stalls.

EDIT: As an FYI:
1: CPU is very rarely the bottleneck. The storage subsystem is almost always the place to start, with the storage-to-compute fabric being the next step. Since this is a standalone host, storage is it.
2: A VM should have at least two vCPUs assigned. If there's a runaway process on a VM with only 1 vCPU assigned, the VM becomes all but impossible to access.
3: PerfMon has both host and in-guest counters. Use those to verify where all of the host's systems are at relative to load.
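
As a rough illustration of the staggering suggested above, here is a minimal Python sketch that splits a list of VMs into small batches with spaced-out start times. It does not use any Altaro API. The function name `stagger_backups`, the VM names, the 22:00 start, the 30-minute gap, and the batch size of 2 are all hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta

def stagger_backups(vm_names, first_start, batch_size=2, gap_minutes=30):
    """Return a list of (start_time, [vm names]) pairs, one per batch.

    Hypothetical helper: it only computes start times and does not
    talk to any backup product.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    schedule = []
    for i in range(0, len(vm_names), batch_size):
        batch_index = i // batch_size
        start = first_start + timedelta(minutes=gap_minutes * batch_index)
        schedule.append((start, vm_names[i:i + batch_size]))
    return schedule

# Hypothetical inventory: 30 VMs named VM01..VM30, first batch at 22:00.
vms = [f"VM{n:02d}" for n in range(1, 31)]
schedule = stagger_backups(vms, datetime(2019, 1, 1, 22, 0))

for start, batch in schedule:
    print(start.strftime("%Y-%m-%d %H:%M"), ", ".join(batch))
```

With these example values the 30 VMs fall into 15 batches, so the backup window runs from 22:00 into the next morning. The gap and batch size would need tuning against what the storage can actually sustain.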
David Johnson, CD, MVP (Retired), author, commented:
The problem is that there was zero disk activity, and it is set to do 2 simultaneous backups, down from the default of 4.
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
The catch for me is that we've run into Altaro and VSS snapshot issues in the past. I'm still leaning towards that being the source of the problem.

vssadmin can be used to delete all snapshots. That's one of the troubleshooting steps that I suggest taking.
vssadmin delete shadows /all


David Johnson, CD, MVP (Retired), author, commented:
I have been using Veeam B&R, and given the hype about Altaro being equal but easier to configure, I thought I'd give it a shot. I didn't notice the problem until I checked my mail and saw all of the "server has been down for more than xx minutes" alerts from Spiceworks. I went into Hyper-V Manager and found the VMs were all in the Stopping state. Get-VM hung. Since this is not production stuff, I did a shutdown/restart.