ESX guest balancing

We have two VMware hosts in a cluster. We have the resources to run all guests on one host and use the other as an HA failover host if necessary. Should we be doing this? What's the best practice recommendation here?

Are we halving the likelihood of having an outage if we just run all machines on one host?
wannabecraigAsked:
kyodaiCommented:
It depends on how critical the guest machines are. If a guest machine is critical to your business, then best practice is to reserve enough resources for it on both the main host and the failover host. If you can live with a temporary outage, or can shut down other guest machines in an emergency, it is also OK to overload the hosts a bit.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Yes, if you have the licensing, running VMs on both hosts will certainly make use of the expensive hardware you have purchased, rather than leaving it sitting idle.

Maybe have a look at DRS (or do it manually).

VMware Distributed Resource Scheduler (DRS) - Dynamic Resource Balancing

http://www.vmware.com/products/drs/overview.html

VMware Distributed Resource Scheduler (DRS) Product Briefs

http://www.vmware.com/files/pdf/VMware-Distributed-Resource-Scheduler-DRS-DS-EN.pdf
http://www.vmware.com/pdf/vmware_drs_wp.pdf
wannabecraigAuthor Commented:
Maybe the Q was too vague.

There are about 6-7 critical machines, in two sets of interdependent ones. To keep the external side of the business up we need a web server, an API server and a DB server. These are linked, and there is no point in one being up if all three aren't.
Our internal CRM is also like this. It needs an API server and a DB server. So the question is: if the servers are balanced between hosts and one host goes down, it takes all the services down anyway, so should they all be on one host, to reduce the chance (by half?) of all systems being rendered nonfunctional?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
VMware HA will restart VMs within 1-2 minutes.

So do you need better availability than this?

If you do, then you should use Microsoft Failover Clustering, with two node VMs, one on each host.

In the event a host fails, service will still be available, apart from a few seconds to account for failover.

If you require higher availability than this, you should look at VMware FT.

BUT what if your SAN fails, the storage all your VMs are on?

What would you do then ?

If you were using DRS, you would group all these VMs together, and the group would be moved as a whole between hosts.

Since you need this whole group of VMs to function as a "vApp": if you put them all on a single host and it fails, you will be up and running again in 1-2 minutes; if you spread them between hosts, you'll get better performance, and the outage time is the same 1-2 minutes.

So spreading the VMs between hosts will give you better performance, and the same outage time of 1-2 minutes.

If you want better, consider Microsoft Failover Clustering, but with the increased costs of OS licenses and VM management to double up all VMs.

Or consider VMware FT.

But also... what about storage failure?
wannabecraigAuthor Commented:
The question of the SAN failing is not within the scope of this discussion.
This is for when a host fails and the best practice surrounding it.

When a guest fails over to the second host, it restarts, right?

I'm just wondering if it's best to have them all on the same host, since if one goes down they are all effectively down.
We have plenty of extra resources to deal with all running on one host without the need for DRS.

We have SRM for a SAN crash, just as a matter of interest.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Okay, so you have decided by your design, that a small outage is acceptable.

I would opt for spreading the VMs between hosts, which will give you better performance and the same outage time of 1-2 minutes.

If you put them all on the same host (which seems a waste of hardware and licenses, with one host sitting idle doing nothing!), you are down for the same 1-2 minutes.

I don't think there is a best solution here. Both have an outage of 1-2 minutes, waiting for HA to discover the VMs are not responding and then restarting the failed VMs.

However, based on your applications, is it better for them ALL to fail than just half?

You would have to test this....

We would spread the load across hosts: better performance, fewer VMs to restart, fewer VMs to move should you need to do maintenance or restart hosts, and fewer VMs affected if services fail on a host and HA and vMotion cannot be used.
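
To illustrate the trade-off described above, here is a rough Python sketch of what a single host failure touches under two placements. The VM names, host names (`esx-a`, `esx-b`) and service groupings are hypothetical, loosely modelled on the web/API/DB and CRM sets mentioned earlier in the thread; it assumes a service is down whenever any of its VMs is on the failed host, and that HA then restarts those VMs on the surviving host.

```python
from typing import Dict, List, Tuple

# Hypothetical service groups: each service needs ALL of its VMs to be up.
SERVICES: Dict[str, List[str]] = {
    "external": ["web", "ext-api", "ext-db"],
    "crm": ["crm-api", "crm-db"],
}


def impact_of_host_failure(
    placement: Dict[str, str], failed_host: str
) -> Tuple[List[str], List[str]]:
    """Return (VMs that HA must restart, services interrupted) for one host failure."""
    restarted = sorted(vm for vm, host in placement.items() if host == failed_host)
    down = sorted(
        name for name, vms in SERVICES.items() if any(vm in restarted for vm in vms)
    )
    return restarted, down


# Placement 1: every VM on esx-a, with esx-b kept as an idle hot spare.
all_on_a = {vm: "esx-a" for vms in SERVICES.values() for vm in vms}

# Placement 2: VMs balanced across both hosts, splitting each service.
balanced = {
    "web": "esx-a",
    "ext-api": "esx-b",
    "ext-db": "esx-a",
    "crm-api": "esx-b",
    "crm-db": "esx-a",
}

for label, placement in (("all on esx-a", all_on_a), ("balanced", balanced)):
    for host in ("esx-a", "esx-b"):
        restarted, down = impact_of_host_failure(placement, host)
        print(f"{label}: {host} fails -> restart {restarted}, services down {down}")
```

In this toy model the balanced placement restarts fewer VMs per incident, but a failure of either host interrupts both services, whereas with everything on one host only a failure of that host does.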
wannabecraigAuthor Commented:
If a host has a 1-in-1000 chance of failing and the guests are linked, then spreading them across both hosts gives a 1-in-500 chance of an outage?
If it's all on one host, then it's 1-in-1000 again, right?

The performance issue is a non-starter; the hosts are massively overspec'ed, so we can run everything more than comfortably on one host. When you say both have a 1-2 minute outage, that's not quite right. They both have it, but one has it half as often.
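
The arithmetic behind this can be checked with a short Python sketch. It assumes the two hosts fail independently with the same probability, that a failure of the idle spare causes no outage, and it reuses the 1-in-1000 figure from above purely as an illustration (the time period is unspecified).

```python
def outage_probability(host_failure_prob: float, hosts_used: int) -> float:
    """Probability that at least one of the hosts a linked service depends on fails."""
    return 1.0 - (1.0 - host_failure_prob) ** hosts_used


p = 1 / 1000  # illustrative per-host failure probability from the post above

one_host = outage_probability(p, hosts_used=1)   # all linked guests on one host
two_hosts = outage_probability(p, hosts_used=2)  # linked guests split over both hosts

print(f"all linked guests on one host:      {one_host:.6f}")
print(f"linked guests split over two hosts: {two_hosts:.6f}")
print(f"ratio: {two_hosts / one_host:.3f}")
```

Under these assumptions the split placement is very slightly under twice as likely to see an outage (1 - 0.999^2 = 0.001999 rather than exactly 0.002), so "roughly 1-in-500 versus 1-in-1000" is a fair summary.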
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
We would spread the load across hosts: better performance, fewer VMs to restart, fewer VMs to move should you need to do maintenance or restart hosts, and fewer VMs affected if services fail on a host and HA and vMotion cannot be used.

If performance is not an issue, there are still other factors when hosts go wrong and you cannot use HA or vMotion.

BUT if your applications, front-end and back-end, are designed to work together, then leave them on the same host. You have the advantage of knowing what these applications and servers are, and how they work.

So leave them all on the same host, and leave the other host running as a "hot spare".

We don't worry about host failure anymore, because we treat hosts in a cluster like disks in a RAID set.

We just keep adding ESXi hosts. If a host fails, fine, it's dead; as long as there are enough resources in the cluster, that's fine.

We don't find with today's technology that the hardware fails, but the software does!
wannabecraigAuthor Commented:
Sure, but there are only 10 servers, and it takes approx. 1 min per guest to migrate for maintenance.
I'm not overly concerned about that.

Why can't I HA? If all are running on one host and the other is a hot spare with HA enabled, then if the live host goes down, they can HA over to the other one?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Remember, vMotion moves a live running VM process, whereas HA restarts VMs on another host when it recognizes a host has failed.

Yes, HA will take over, and start restarting VMs. (provided it has been tested and configured correctly).

But it's not instant, and you will experience service failure, and downtime, and outage, and compared to Microsoft Failover Clustering it's a longer outage.
wannabecraigAuthor Commented:
The licensing and setup model for clustering makes it far more of a setup & admin nightmare for us.
We've opted for ESX and SRM and we'll live with any downtime.

We're really just looking to do what's statistically the smartest thing to do given the linked guests.
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Leave them all on a single host, and keep it simple for yourselves.