VMware HA Resource Monitoring

I have a two-host VMware HA configuration and had an issue with servers vmotioning when one of the hosts went down.

Some of the VMs that vmotioned from the down server would not come online because there were not enough resources for all the VMs to start on the remaining host.

How do I monitor or see the resource load between hosts to prevent over-allocating, which would make HA ineffective?

Zephyr ICT, Cloud Architect, Commented:
You can always check the cluster information from the VirtualCenter inventory. When you click on the cluster name, the Summary page should display high-level information about it.

It should display the cluster's admission control setting, the current failover capacity, and the configured failover capacity, plus extra information regarding DRS if it is enabled/available.

The state also uses colors so you can quickly see the status of the cluster.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
You will always need enough resources on the server that remains up.

So if you are using 32GB of memory on both Server A and Server B, and Server A fails, you will need at least another 32GB free on Server B.

You either need to increase the RAM in your host servers or purchase an additional server.
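
The rule above can be sketched as a simple check: assume each host fails in turn and see whether the spare memory on the remaining hosts covers what the failed host was running. This is only a rough illustration. The host names and the 48GB capacity are hypothetical, and the check ignores CPU, per-VM memory overhead, and reservations.

```python
def survives_host_failure(hosts):
    """Map each host name to True if the other hosts could absorb its memory load.

    hosts: dict of name -> {"capacity_gb": number, "used_gb": number}
    """
    result = {}
    for failed, info in hosts.items():
        # Spare memory on every host except the one assumed to have failed.
        spare = sum(
            h["capacity_gb"] - h["used_gb"]
            for name, h in hosts.items()
            if name != failed
        )
        result[failed] = info["used_gb"] <= spare
    return result


# Hypothetical two-host cluster: each host has 48GB and runs 32GB of VMs.
hosts = {
    "server_a": {"capacity_gb": 48, "used_gb": 32},
    "server_b": {"capacity_gb": 48, "used_gb": 32},
}
print(survives_host_failure(hosts))
```

With these made-up numbers neither host can absorb the other, because each has only 16GB spare for a 32GB load. Raising the capacity to 64GB per host makes both checks pass.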
Mr Tortur, System Engineer, Commented:
When your host fails there is no vMotion. A vMotion is when you migrate a running VM from one running host to another running host. When HA detects a host failure, it restarts the VMs from the failed host on the remaining hosts, assuming there are hosts remaining and that they have enough resources to start the failed VMs.
jdr0606, Author, Commented:
Sorry I used vmotion improperly.

The attached image shows the state of the two HA hosts and the resource usage.  Looking at the stats it would appear that when host 2 failed there should have been enough resources on host 1 to handle the additional load from host 2, am I reading it correctly?

[Image: VMware host resources]
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
If there had been enough resources on Host 2 to satisfy the VMs on Host 1, they would have started.

You have three options:

1. Increase memory on the hosts.
2. Purchase an additional host.
3. Disable Admission Control. This will enable ALL the VMs to start on Host 2 (1) in the event of failure, but you may find that the host runs out of resources and starts to go slow, which will affect the performance of ALL VMs.

So, at present, can you let me know:

1. Total CPU and memory on Host 1
2. Total CPU and memory on Host 2

From what I can currently see, you are very close, and borderline...

e.g. if you take that 50% on Host 1 and put it on Host 2, which is at approx. 25%, that would be approx. 75%!

jdr0606, Author, Commented:
It was the other way around.

Host 2 failed, so the VMs on Host 2 were attempting to restart on Host 1.

The two hosts each have a CPU capacity of 16 x 2.699 GHz and 262074 MB of memory.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Okay, so it's still 50% + 25% = approx. 75%!
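
As a rough worked example of that arithmetic, the sketch below converts the percentages into absolute figures. It assumes two identical hosts with the capacity quoted in this thread (16 x 2.699 GHz, 262074 MB) and treats the 50% and 25% read from the screenshot as applying to both CPU and memory, which is an approximation. It reflects usage at one moment only, not reservations or peaks.

```python
HOST_CPU_GHZ = 16 * 2.699   # per-host CPU capacity quoted in the thread
HOST_MEM_MB = 262074        # per-host memory quoted in the thread


def combined_load(capacity, surviving_pct, failed_pct):
    """Return (absolute load, percent of capacity) on the surviving host.

    Assumes both hosts have the same capacity, so the percentages simply add.
    """
    load = capacity * (surviving_pct + failed_pct) / 100
    return load, 100 * load / capacity


mem_load, mem_pct = combined_load(HOST_MEM_MB, 50, 25)
cpu_load, cpu_pct = combined_load(HOST_CPU_GHZ, 50, 25)
print(f"memory: {mem_load:.1f} MB ({mem_pct:.0f}% of one host)")
print(f"cpu: {cpu_load:.3f} GHz ({cpu_pct:.0f}% of one host)")
```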

Look at the Summary for each host now. What are the CPU and memory percentages?
jdr0606, Author, Commented:
Even at 75% I still should not have had resource issues, when host 2 failed, correct?
Mr Tortur, System Engineer, Commented:
You should not have had issues if the resource usage were constant.
But it is not, and your statistics screenshot shows the resource usage at one moment.
Maybe more resources were in use when host 2 failed.
From my knowledge of HA, which I have tested many times, I would say, like Andrew Hancock, that if there had been enough resources the VMs from host 2 would have restarted on host 1.
Do you have one VM with a very high vCPU or memory assignment (in particular, a large reservation)? If so, this will throw off your HA slot size, because the slot size is based on the largest VM reservation.
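
To illustrate how a single large VM can skew the slot size: under the slot-based "Host failures the cluster tolerates" admission control policy, HA sizes a slot from the largest CPU and memory reservations among powered-on VMs, then counts how many slots each host can hold. The sketch below is a simplified model of that calculation, not VMware's implementation. The VM reservations are hypothetical, the default slot values (32 MHz, 128 MB) are assumptions that vary by vSphere version and VM overhead, and the raw host capacity from this thread is used even though the capacity actually available to VMs is somewhat lower.

```python
def slot_size(vms, default_cpu_mhz=32, default_mem_mb=128):
    """Slot size = largest CPU and memory reservation among powered-on VMs.

    The defaults stand in for the values used when no VM has a reservation;
    both numbers are assumptions for this sketch.
    """
    cpu = max([vm["cpu_res_mhz"] for vm in vms] + [default_cpu_mhz])
    mem = max([vm["mem_res_mb"] for vm in vms] + [default_mem_mb])
    return cpu, mem


def slots_per_host(cpu_capacity_mhz, mem_capacity_mb, slot):
    """A host holds as many slots as its scarcer resource allows."""
    cpu_slot, mem_slot = slot
    return min(cpu_capacity_mhz // cpu_slot, mem_capacity_mb // mem_slot)


HOST_CPU_MHZ = 16 * 2699   # raw per-host capacity from the thread
HOST_MEM_MB = 262074

# Hypothetical VMs: ten with no reservations, one with large reservations.
small_vms = [{"cpu_res_mhz": 0, "mem_res_mb": 0} for _ in range(10)]
big_vm = {"cpu_res_mhz": 4000, "mem_res_mb": 65536}

without_big = slots_per_host(HOST_CPU_MHZ, HOST_MEM_MB, slot_size(small_vms))
with_big = slots_per_host(HOST_CPU_MHZ, HOST_MEM_MB, slot_size(small_vms + [big_vm]))
print(without_big, with_big)
```

In this made-up case the one VM with a 64 GB memory reservation cuts the number of slots per host from over a thousand to three, even though actual usage on the hosts has not changed.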