VMware HA with DRS and VSA

Hi Experts ...

I need some advice; I'm not sure if I have my wires crossed. I'm trying to set up an environment for a client and I keep having problems with how the environment is structured.

What I have:

4x IBM servers, each with 32 GB RAM, 4x 300 GB SAS drives in RAID 5 and dual quad-core Xeon processors, running ESXi 5.5.

First Server:
Used only to host two VMs: 1) Server 2008 R2 with Active Directory, 2) Server 2008 R2 with SQL Server Express and VMware vCenter Server 5.5.

Second, Third and Fourth Server:
All running ESXi 5.5 Enterprise edition.

All 3 servers are added to vCenter as hosts.

Installed VSA (VMware vSphere Storage Appliance). This allows us to create three NAS storage devices that are replicated across all three hosts, for redundancy.

We have created a resource pool allocating about 60-70% of the cluster's resources, with 1 host's worth of capacity kept for failover.
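As a rough sanity check on that 60-70% figure: with three identical hosts and one host's worth of capacity held back for failover, at most two thirds of the cluster can be committed. A minimal sketch using the host specs listed above (the function name is made up for illustration; this is plain arithmetic, not a VMware API, and it ignores hypervisor and VSA appliance overhead):

```python
def usable_fraction(total_hosts: int, failover_hosts: int) -> float:
    """Fraction of cluster capacity left once failover hosts are held back.

    Assumes all hosts are identical, as they are in this setup.
    """
    if total_hosts <= 0 or not 0 <= failover_hosts < total_hosts:
        raise ValueError("need at least one non-failover host")
    return (total_hosts - failover_hosts) / total_hosts


hosts = 3              # second, third and fourth server
ram_per_host_gb = 32   # from the hardware list above

fraction = usable_fraction(hosts, failover_hosts=1)
total_ram_gb = hosts * ram_per_host_gb
usable_ram_gb = total_ram_gb * fraction

print(f"usable fraction: {fraction:.1%}")
print(f"usable RAM: {usable_ram_gb:.0f} GB of {total_ram_gb} GB")
```

So a pool sized at 60-70% sits right at the limit of what two surviving hosts could carry.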

What we are trying to do:

We want to install and create the VMs in the resource pool and have HA and DRS vMotion the VMs from one host to another automatically when a host goes offline or down.

We are finding that with VSA installed in the cluster the functionality doesn't work the same, or do I have it wrong?

Thanks in advance.


Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
VMware HA - if a host fails, the VMs that were running on it are restarted on the other available hosts in the cluster.

It's a Cluster function, so you need to have defined a cluster containing ALL three hosts.

A Host Failure means an uncontrolled failure, not a shutdown or restart, e.g. pulling the power out of the server is an uncontrolled failure. If you issue a shutdown or restart, VMware HA will not restart the VMs. This is often misunderstood about VMware HA.

VMware DRS - again, this is a Cluster function. VMs can vMotion automatically around the cluster based on the DRS setting, e.g. Aggressive. You will find DRS only kicks in when hosts are heavily loaded. Do not expect DRS to give every host equal memory and CPU usage; DRS is not load balancing.

So, what's the issue?
trevsoft (Author) Commented:
Hi hanccocka,

My issue is that when I pull the power out of one of the hosts in the cluster, it keeps coming back saying that HA failed.

I've had HA and DRS working before; the only difference this time is that we are also running VSA. It's like vMotion isn't moving the VMs.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
vMotion is not used by VMware HA.

If VMware HA does not restart the VMs on other hosts after you power off a server, the networking is possibly incorrect.

Do you get an error message?
trevsoft (Author) Commented:

No, the VMs come up as disconnected, and so does the VSA storage device.

Does this have anything to do with Storage DRS? Just curious.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
VMware HA is not functioning.

VMware HA, VMware DRS and VMware Storage DRS are all different functions.

Can you manually vMotion a VM?

Has any networking configuration changed, especially the default gateway information?
trevsoft (Author) Commented:
Yeah, I know they are all different.

Let me try a manual vMotion.
trevsoft (Author) Commented:
OK, I worked out two things.

Firstly, you can't vMotion manually or automatically while a CD ISO is still connected.

Secondly, you need to use Storage DRS to store the VMs; having them on each VSA datastore doesn't do the same thing.

So it's all working now.

The only thing I don't understand: when the VM fails over, it turns off and then powers up again. I'm not sure why it does that.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
That is correct, CD-ROMs and floppy drives must be disconnected! We usually run scripts to make sure CD-ROM and floppy drives are disconnected, or removed from the VMs.
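A check of that kind could look roughly like the sketch below. It runs over a made-up in-memory inventory: the `VirtualDevice` and `VM` classes, the VM names and the ISO path are all hypothetical and are not the vSphere API. A real script would pull the device list from vCenter instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VirtualDevice:
    kind: str          # "cdrom", "floppy", "disk", "nic", ...
    connected: bool
    backing: str = ""  # e.g. path to an ISO; empty if no media


@dataclass
class VM:
    name: str
    devices: List[VirtualDevice] = field(default_factory=list)


REMOVABLE = {"cdrom", "floppy"}


def vms_with_connected_media(vms: List[VM]) -> Dict[str, List[str]]:
    """Map VM name -> connected removable devices that should be disconnected."""
    report: Dict[str, List[str]] = {}
    for vm in vms:
        hits = [
            f"{dev.kind}: {dev.backing or 'no media'}"
            for dev in vm.devices
            if dev.kind in REMOVABLE and dev.connected
        ]
        if hits:
            report[vm.name] = hits
    return report


# Hypothetical sample inventory, for illustration only.
inventory = [
    VM("app01", [VirtualDevice("disk", True),
                 VirtualDevice("cdrom", True, "[local] iso/install.iso")]),
    VM("db01", [VirtualDevice("disk", True),
                VirtualDevice("cdrom", False)]),
    VM("web01", [VirtualDevice("floppy", True)]),
]

report = vms_with_connected_media(inventory)
for name, devices in report.items():
    print(f"{name}: disconnect {', '.join(devices)}")
```

Only `app01` and `web01` are reported here; `db01` has a CD-ROM drive, but it is not connected.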

VMware HA - you do not seem to understand how VMware HA operates (maybe!)

VMware HA - in the event of a host failure, it RESTARTS VMs on other available hosts! The VM is not turned off, it has failed! It cannot complete a normal shutdown or power off in the event of an uncontrolled shutdown, because the host has failed.

VMware HA and VMs do not have predictive failover! (e.g. they do not know the host is about to fail!)

It does not use vMotion (i.e. live migrate VMs to other available hosts), because it does not know that the host is about to fail.

So this is normal!

It's a common mistake amongst VMware admins.

DRS - uses vMotion.

VMware HA does not use vMotion; when a host fails, it just restarts the VMs that host was running on other hosts! Hence you can expect 1-2 minutes of downtime per VM.
trevsoft (Author) Commented:
Yeah, I understand how HA works, but I don't understand why the VM isn't being migrated live; instead it turns the VM off and then powers it up again.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Are you discussing VMware HA?

Virtual machines are NEVER migrated by VMware HA!

VMware HA does not use vMotion!

Do you understand the above? If not, I can explain.
trevsoft (Author) Commented:
Can you please clarify, then?

My issue now is that when there is a server failure, the VM that's being moved is stopped and then restarted. Is there a reason this happens?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Okay, let's look at VMware HA in a kind of Andy's noddy-style explanation. Please excuse me if I'm teaching you to suck eggs...

Take at least two ESXi Host Servers, added to a Cluster with VMware HA (High Availability) enabled. HA Agents are installed on both ESXi servers, and vCenter Server is used to configure VMware HA, but it does not take part in or control the HA Agents - vCenter's role is only to configure VMware HA.

One server will be the Master, and the other will be the Slave; this can be seen in the Host Summary.

VMs hosted on both ESXi Host Servers become vSphere HA Protected, and there should be a green tick which states Protected. This can be confirmed for the VM under VM Summary.

Do you follow the above, and I will continue?
trevsoft (Author) Commented:
Yes, totally
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
The HA Agents on the hosts...

vSphere HA State - Master - A server which is elected as the master. This agent monitors the VMs on this server, and other operational Hosts, and it WILL attempt to restart VMs on failure.

vSphere HA State - Slave - This server is connected to the Master Agent, via the Management Network. The vSphere HA Protected VMs on this server are monitored by one or more vSphere HA Master Agents, and the agent will attempt to restart VMs after a failure.

vSphere HA Protected VM - vSphere will attempt to restart the VM after a supported failure of the VM.

VM is HA Protected on the following conditions:-

VM is in a vSphere HA enabled cluster.
VM powered on successfully after a user-initiated power on.
vSphere HA has recorded that the power state is ON.
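
The three conditions above can be read as one simple test that must pass in full. A minimal sketch, where the `VMState` class and its field names are made up for illustration and are not vSphere properties:

```python
from dataclasses import dataclass


@dataclass
class VMState:
    name: str
    in_ha_cluster: bool            # VM is in a vSphere HA enabled cluster
    user_power_on_succeeded: bool  # the user's power on completed successfully
    ha_recorded_powered_on: bool   # HA has recorded the power state as ON


def is_ha_protected(vm: VMState) -> bool:
    """A VM counts as HA Protected only if all three conditions hold."""
    return (vm.in_ha_cluster
            and vm.user_power_on_succeeded
            and vm.ha_recorded_powered_on)


# Hypothetical examples.
protected = VMState("app01", True, True, True)
powered_off = VMState("test01", True, False, False)
standalone = VMState("lab01", False, True, True)

for vm in (protected, powered_off, standalone):
    print(f"{vm.name}: {'Protected' if is_ha_protected(vm) else 'Not protected'}")
```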

When an ESXi Host Server Fails (which is part of a VMware HA Cluster), all the Virtual Machines, which are hosted on that Host, also go down, e.g. fail.

A Host Failure could be:-

1. Pink/Purple Screen of Death - caused by a memory fault.
2. Pink/Purple Screen of Death - caused by a CPU fault.
3. Power supply failure (if only a single power supply)

A manual host shutdown, reboot or restart is not considered a host failure, because it's a controlled shutdown.
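
The distinction between an uncontrolled failure and a controlled shutdown can be written down as a small lookup. This is only an illustration of the rule described above; the `HostEvent` names are invented for the sketch and the list of events is not exhaustive:

```python
from enum import Enum


class HostEvent(Enum):
    PSOD_MEMORY_FAULT = "purple screen, memory fault"
    PSOD_CPU_FAULT = "purple screen, CPU fault"
    POWER_SUPPLY_FAILURE = "power supply failure"
    MANUAL_SHUTDOWN = "manual shutdown"
    MANUAL_REBOOT = "manual reboot"


# Events the host had no chance to handle gracefully.
UNCONTROLLED = {
    HostEvent.PSOD_MEMORY_FAULT,
    HostEvent.PSOD_CPU_FAULT,
    HostEvent.POWER_SUPPLY_FAILURE,
}


def is_host_failure(event: HostEvent) -> bool:
    """True if the event is an uncontrolled failure that HA reacts to."""
    return event in UNCONTROLLED


for event in HostEvent:
    verdict = ("host failure, HA restarts the VMs"
               if is_host_failure(event)
               else "controlled, no HA restart")
    print(f"{event.value}: {verdict}")
```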

So we have a Host which has failed, and ALL the VMs it was hosting are now DOWN!

Any questions so far?

Do you follow the above, and I will continue?
trevsoft (Author) Commented:
Yes, that's right, the host went offline (because I pulled the power out), and instead of leaving the VMs running and moving them to another host, it turned them off, moved the VMs and then restarted them.

Is there a way to do this without turning off the VMs? Maybe it's just a setting.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
All your VMs went down because the host failed (you pulled out the power). The host did not turn them off, because it did not know it was about to fail; for that, the Host Server would have to know you were going to pull the power out.

So the Host is Down, VMs are Down (because you pulled out the power!).

As we discussed earlier, the VMs are being monitored by the HA Agent, so vSphere HA Protected VMs will be RESTARTED on other AVAILABLE HOSTS.

This is the function of VMware HA: it restarts VMs automatically when the host running them fails.

The technology you desire would need to know that an event is coming which could take down the VMs, and move them before the server fails! Impossible! Servers do not have crystal balls, and cannot see into the future!

That technology would require predictive future event analysis.

It does not really move the VMs, because your VMs are DOWN. All VMs are stored on shared storage, so it just restarts them on new hosts - VMware HA.

Your VMware HA is running as per design.
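
To make the sequence concrete, here is a toy model of what happens when the power is pulled: the VMs go down with the host and are then cold-started elsewhere, with no live migration in between. The host and VM names are hypothetical, and the placement rule (fewest running VMs first) is a simplification for the sketch, not how the HA Agent actually chooses a host.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Host:
    name: str
    alive: bool = True
    running: List[str] = field(default_factory=list)


def fail_host(hosts: Dict[str, Host], failed: str) -> List[str]:
    """Uncontrolled failure: the host and every VM on it go down at once."""
    host = hosts[failed]
    host.alive = False
    downed, host.running = host.running, []
    return downed


def ha_restart(hosts: Dict[str, Host], downed: List[str]) -> Dict[str, str]:
    """Power each downed VM back on, on a surviving host (a restart, not vMotion)."""
    survivors = [h for h in hosts.values() if h.alive]
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")
    placement: Dict[str, str] = {}
    for vm in downed:
        # Simplified placement: pick the survivor running the fewest VMs.
        target = min(survivors, key=lambda h: (len(h.running), h.name))
        target.running.append(vm)
        placement[vm] = target.name
    return placement


cluster = {
    "esx2": Host("esx2", running=["app01", "app02"]),
    "esx3": Host("esx3", running=["db01"]),
    "esx4": Host("esx4", running=["web01"]),
}

downed = fail_host(cluster, "esx2")      # pull the power on esx2
placement = ha_restart(cluster, downed)  # VMs boot again on the survivors
print(placement)
```

The VMs only ever move from "down" to "restarted"; the model has no step where a running VM is carried across, which is the difference from vMotion.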
trevsoft (Author) Commented:

OK, that's all I need to know.

I wasn't sure if the VMs shutting down was meant to happen.

But the way you explained it makes sense.

Thank you for all that.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
If a host goes down, the VMs hosted on it will go down too; there is no way to prevent that in a failure.
