ESX 3.5 cluster creation best practices

Hello Experts,

I have a 6-host ESX 3.5 datacenter with VirtualCenter 2.5 and 42 virtual machines. The back-end storage is a SAN reached over both Fibre Channel and iSCSI. There are licenses for VMotion, DRS, and HA, but no clusters are configured.

I want to take advantage of clustering and have everything set up before I start to prep for migration to 4.0. I have been looking at the resource management guide and feel I have a pretty good handle on the concepts, but I am looking for some steps and best practices for configuring clustering. When I add hosts to a cluster, does it affect the VMs on that host? What can go wrong when I do this? Can I undo it?

Any help and guidance is greatly appreciated as always.
VMwareGuy Commented:
Nobody has given you what you really need, so I will address what you are really looking for:

1)  Is the default gateway used by your ESX hosts pingable?  Verify this immediately.  If it is not pingable, your HA cluster will fail when you try to create it.  To avoid this, create an advanced parameter by clicking the advanced tab in the HA settings: enter das.isolationaddress and set its value to a pingable IP address that you deem fit for the HA cluster to use when determining whether a host has become isolated from the network.  Then create another parameter called das.usedefaultisolationaddress and set its value to FALSE.  You can add a couple of IPs using this method, but if you do, you should also increase the timeout value, das.failuredetectiontime.  The default is 15 seconds; increase it to 30 seconds when more than one isolation address is used.  Refer to the ESX 3.5 resource management PDF; I've attached it for you.

2)  Next, do you have virtual machines running on the same network as your ESX service console connection?  If you do, then within the HA settings, set the network isolation response to "shut down" rather than leaving the VMs powered on.  If a host loses its service console network connection, you typically don't want it to shut down VMs, because they may all still be running fine on a different network mapped to different NICs, or possibly even to a different switch.  But if they are on the same network as the ESX service console and the service console loses its network for some reason, there is a solid chance that the same cause also affects the VMs running on that network.  Think about how this applies to your network.

3)  Finally, look closely at your VMs and what they do, and define your virtual machine startup order to prioritize the important servers that should be up first after an outage.  Your DCs and DNS should always take priority, so that when other servers power on and boot they can reach a DC, sync time (if using w32time), and validate their machine accounts in Kerberos/AD.  SQL should come up before the apps, because apps can fail if their DB is not available during boot; this happens to vCenter all the time.
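As a rough illustration of the ordering in step 3, here is a minimal Python sketch that sorts an inventory so infrastructure comes up before databases and databases before applications. The role names, tier numbers, and VM names are all hypothetical, and this only plans an order on paper; it does not talk to VirtualCenter or set anything in the HA configuration.

```python
# Hypothetical role tiers: lower numbers should power on first.
ROLE_TIER = {"dc": 0, "dns": 0, "sql": 1, "app": 2}
UNKNOWN_TIER = 3  # anything unclassified boots last


def restart_order(vms):
    """Return VM names sorted by tier, then by name within a tier.

    `vms` is a list of (name, role) pairs.
    """
    ranked = sorted(vms, key=lambda vm: (ROLE_TIER.get(vm[1], UNKNOWN_TIER), vm[0]))
    return [name for name, _role in ranked]


# Made-up inventory for illustration only.
vms = [
    ("web01", "app"),
    ("sql01", "sql"),
    ("dc01", "dc"),
    ("vcenter", "app"),
    ("dns01", "dns"),
]
print(restart_order(vms))
```

With the sample list, the domain controller and DNS server sort ahead of the SQL server, and vCenter lands in the application tier behind its database.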

These are the basics of best practices, and in fact there is a question on VMware's advanced datacenter design exam that looks for this exact information. I just took it. Know it!

Most engineers don't drill deep enough into this to configure it properly for their environment.  You now have a higher level of understanding of VMware HA.      
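The advanced options from step 1 could be collected along these lines before you type them into the HA settings. This is only a sketch under assumptions: the three option names come from the text above, but the numbered keys for multiple addresses (das.isolationaddress1, das.isolationaddress2, ...) and the millisecond unit for das.failuredetectiontime (30000 for 30 seconds) are assumptions to check against the resource management guide. The IP addresses are placeholders.

```python
def ha_advanced_options(isolation_addresses):
    """Build a dict of HA advanced options for the given pingable addresses."""
    if not isolation_addresses:
        raise ValueError("at least one pingable isolation address is required")

    # Stop HA from using the (unpingable) default gateway.
    options = {"das.usedefaultisolationaddress": "false"}

    if len(isolation_addresses) == 1:
        options["das.isolationaddress"] = isolation_addresses[0]
    else:
        # Assumption: additional addresses use numbered keys.
        for index, address in enumerate(isolation_addresses, start=1):
            options[f"das.isolationaddress{index}"] = address
        # Assumption: value is in milliseconds, so 30 seconds is 30000.
        options["das.failuredetectiontime"] = "30000"

    return options


# Placeholder addresses from the documentation range, not real hosts.
for key, value in sorted(ha_advanced_options(["192.0.2.1", "192.0.2.2"]).items()):
    print(key, "=", value)
```

The timeout is raised only when more than one address is given, which mirrors the advice in step 1.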
Danny McDaniel, Clinical Systems Analyst, Commented:
Just create a cluster but do not enable DRS or HA, drag your hosts into it and then enable the feature(s) you are going to use.

If you need to remove a host from the cluster, it will tell you that the host needs to be in Maintenance mode first, but you can get around that by right-clicking on the host and selecting 'disconnect'.  At that point you can drag the host out, then right-click and connect, and it's back in business while the VMs keep running.

When you enable DRS, all of the work is done on the VC, so nothing affects the VMs unless it decides that it needs to balance the load and VMotions some of them.

Enabling HA installs another agent onto each host and then runs some checks as part of the process.  VM operations shouldn't be affected by this, either.  The worst that typically happens is that a host fails one of the checks and HA doesn't enable.  Then you have to figure out what the problem is.  There's an option when you right-click a host to re-enable HA on it, which goes through the process again to get HA running on the selected host.

You don't need to have your hosts in a cluster to use vMotion, btw.  That will work as long as you have the hosts in the same datacenter and they meet all of the other requirements.

Overall, it is pretty safe to set up a cluster and enable the various features, but you should have the latest version of vCenter installed (you can install VC 4.0 now if you like) and your hosts should be updated to the later versions of ESX, too.  Rule of thumb: the VC version/update level should be the same as or higher than the hosts' version/update level.
Deepak Kosaraju, DevOps Engineer, Commented:
-> First, download VMware's CPU Identification Utility to validate that your servers are ready for vSphere 4. For vSphere, your CPU has to be 64-bit and able to run 64-bit VMs.
-> Another tool you can try, without burning an ISO, is VMware CPU Host Info.
The limitation of this tool is that it only shows whether your host CPU is VT capable, meaning that it can run 64-bit VMs.

-> Once you have made sure your existing systems are compatible with vSphere 4, you can use VMotion to move your VMs around as part of your resource planning inside VI 3.5.

-> I recommend setting up HA once you have migrated your whole site to vSphere 4, so you are not duplicating the time and work.
Deepak Kosaraju, DevOps Engineer, Commented:
Migration is heavy work, so you have to plan everything ahead, from CPU to storage and network, to make sure you take full advantage of vSphere.
Raymo12, Author, Commented:
Thanks guys.

My other issue right now is that my vCenter server is not 64-bit. I was thinking of creating a new vCenter server with vSphere 4, then bringing in the dev environment first and then the others. I have two more servers designated as new ESX hosts as well.

Do you suggest 4.0 or 4.1? I'm going to keep this question open for a while as I'm sure I'll have a few more questions.
Danny McDaniel, Clinical Systems Analyst, Commented:
There is a KB for a known issue when upgrading your VC from 2.5 to 4.1, so make sure to check it out.  There's a link to it on the download page, with a SQL script to run to check your DB.

They changed the editions/features matrix from 4.0 to 4.1 so you may get more features available to you by going to 4.1.
Deepak Kosaraju, DevOps Engineer, Commented:
It is better to keep the question specific to this post. If you have more questions, open a separate question to cover them; that is the best way to get more experts to comment. A good title for it would be "VMware Migration 3.5 - 4 Best Practices".
Some knowledge base articles you might refer to: