VMware ESX Environment Design

Good Morning All,

I am attempting to redesign our current VMware environment and was looking for some guidance from other VMware experts.  Here is our current setup:
3- HP blades, all running VMware ESX 3.5 (each server has 8 cores and 16 GB RAM)
1- Hitachi SAN with 8 LUNs (connected via Fibre Channel)

2- Dell 2850s, both running ESX 3.0 (each has a quad-core CPU with 8 GB RAM)
Connected to the Hitachi SAN (via Fibre Channel)

1- Virtual Center instance controlling all servers

We just purchased 2 more HP blades (dual-processor quad-cores with 32 GB RAM) and will most likely upgrade 2 of our existing blades to 32 GB of RAM. I have 2 questions about how I should structure the environment with the 2 new servers coming in.  I am the new VMware admin, and the previous admin and I disagree about how the environment should be set up.

A. Should I put all 5 blades together into 1 big cluster and then split up the environment with resource pools?  Currently I have a DEV and a PRD resource pool.  DEV gets normal shares and a smaller reservation of CPU and RAM, and PRD gets higher shares and more CPU and RAM reserved.
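To make the trade-off concrete, here is a rough sketch of how shares and reservations interact under contention in a single cluster. The MHz figures and share values are illustrative assumptions, not numbers from this environment:

```python
# Sketch of share-based allocation under contention (assumed numbers):
# reservations are honored first, then the remaining capacity is split
# in proportion to each pool's shares.

def allocate(total_mhz, pools):
    """Honor each pool's reservation, then divide what's left by shares."""
    remaining = total_mhz - sum(p["reservation"] for p in pools.values())
    total_shares = sum(p["shares"] for p in pools.values())
    return {
        name: p["reservation"] + remaining * p["shares"] / total_shares
        for name, p in pools.items()
    }

pools = {
    "PRD": {"shares": 8000, "reservation": 16000},  # high shares, big reserve
    "DEV": {"shares": 4000, "reservation": 4000},   # normal shares
}
alloc = allocate(40000, pools)
# PRD: 16000 + 20000 * 2/3 ~ 29333 MHz; DEV: 4000 + 20000 * 1/3 ~ 10667 MHz
```

The point is that in one big cluster, DEV can never starve PRD as long as the reservations and shares are set sensibly; the split only matters when the cluster is actually contended.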

B. I could keep the servers grouped in clusters (3 PRD, 2 DEV), have separate LUNs for DEV and PRD, and then only create resource pools for high-value servers.

The previous admin is worried because we used VCB in the past and had issues.  When we ran the nightly backups, we completely clogged our VMware network and had to ditch that backup option. (2 NICs on each blade: one for the VMs, the other for the host.)  He is afraid that with all 5 hosts able to see all available LUNs, the disk I/O might cause problems on our SAN.  Is this something we should worry about?

Thanks for your help everyone.  Any and all ideas are greatly appreciated!

nappy_d commented:
Well to a certain degree, he is right.  Depending on what your applications are doing, disk IO can be an issue.  

What we did in our planning phase was to create LUNs based on physical disks.  As an example, tray 1 of our SAN contains 12 x 400 GB SAS drives.  We then created smaller disk groups (6 drives each) and assigned LUNs based on what we thought I/O performance would be.  This way we seem to have reduced performance hits.
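The arithmetic behind this disk-group approach is simple. The per-drive IOPS figure and the RAID-5 layout below are assumptions for illustration (real numbers depend on the drives, RAID level, and workload):

```python
# Back-of-the-envelope sizing for the disk-group approach above.
# Assumptions: 400 GB SAS drives at roughly 150 IOPS each,
# RAID-5 losing one drive's worth of capacity per group.

DRIVES_PER_TRAY = 12
GROUP_SIZE = 6
DRIVE_GB = 400
DRIVE_IOPS = 150          # assumed; varies by drive model and workload

groups = DRIVES_PER_TRAY // GROUP_SIZE       # disk groups per tray
usable_gb = (GROUP_SIZE - 1) * DRIVE_GB      # RAID-5: n-1 drives usable
aggregate_iops = GROUP_SIZE * DRIVE_IOPS     # rough IOPS ceiling per group

print(groups, usable_gb, aggregate_iops)     # 2 2000 900
```

Carving LUNs out of a group means every LUN in that group shares the same ~900 IOPS budget, which is why pairing heavy and light workloads within a group matters.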

We have one VC server that runs VCB scripts for ARCserve.  Though we don't have complaints about our backup window, it may be worth it for your environment to have a secondary SAN for snap copies; you could then back up from the secondary SAN.

Now, are you connecting to your SAN via iSCSI or Fibre Channel?
vmwarun - Arun commented:
My suggestion would be to divide the PRD and DEV hosts into 2 clusters and map the LUNs accordingly.

With respect to your VCB issue, what mode did you use for backup (SAN mode or LAN mode)?
juselding (Author) commented:
We were using LAN mode to back up all of the servers.  We had a standalone VCB server and ran backups across our network.

Can I ask why you would create 2 separate clusters as opposed to setting up resource pools?

My two cents (at least it will start the discussions :))

I'd throw everything into a single cluster with the proper resource pools (though this might not help with VMotion if the CPUs are not compatible).
I'd also rejoin all the LUNs (splitting the storage into a set of 450 GB LUNs). That will give you much more flexibility and a much better ROI.

Be very careful when you present the LUNs to the hosts. Each LUN must be presented with the same ID to all the hosts.

I've been using VCB for months without a problem since the installation.
The installation was shaky, though. You have to be very careful when the LUNs are presented to the VCB server.
Bit late, discussion was started already :)
vmwarun - Arun commented:
Since you have a Fibre Channel SAN, you could have used SAN mode to back up your VMs instead of LAN mode. LAN mode uses the network infrastructure to back up VMs, which is the primary reason you experienced problems with VCB initially.

I suggested separate clusters with ease of administration in mind. I only suggest resource pools when there is a single department and resources are allocated to VMs purely based on requirements.
juselding (Author) commented:
nappy_d, that makes a lot of sense, and thanks for your input.  We are connecting to our SAN via Fibre Channel, with a mix of SATA and Fibre Channel disks.  If we are breaking up our LUNs to control I/O performance, would it make much of a difference to split our environment into 2 clusters and divide the LUNs among servers?  If all hosts have access to all LUNs, does it (or could it) increase I/O load and thus degrade performance, or does it only matter which VMs are stored on which LUNs?  What about security, is that even a concern?

vmwarun - Arun commented:
The I/O hits actually depend on the type of VMs hosted on the ESX hosts.

Hosting multiple SQL DB VMs on the same host with visibility to the same LUN might be bad for I/O.

As long as you restrict access using zoning or LUN masking, security should not be a problem.
nappy_d commented:
Yes, if ALL hosts are accessing all LUNs simultaneously and performing read/write operations, you will see a performance hit, which is why I would suggest you break up your LUN assignments and create smaller disk groups.

Also, since you have the resources (it seems like), it would not be a bad idea to create two separate clusters under ONE VC: one for DEV and one for PRD.  This would help reduce constraints if a node in the cluster were to fail.

So I would think about creating LUNs similar to this:

Disk group 1 has 4 LUNs
  • LUNs 1, 2, 3, and 4 get assigned to the DEV cluster
  • LUNs 1 & 2 get assigned to DEV guests 1 & 2
  • LUNs 3 & 4 get assigned to DEV guests 3 & 4
juselding (Author) commented:

I just want to make sure I am following you.  When you say node, I assume you are referring to a host or any piece of my blade center that could potentially fail?  Also, you are suggesting that splitting my environment into 2 clusters, as opposed to 1 large cluster with resource pools, would be the better option?

PRD - 3 hosts (32 GB RAM and 8 cores per server)
SAN Fibre connected

DEV - 2 hosts (16 GB on one, 32 GB on the other, 8 cores per server)
SAN Fibre connected

I already have 8 LUNs and plan to requisition 2 more: 4 for DEV, 6 for PRD.

Thanks for the clarification.


I see a big difference between the cluster question and the storage question.

The cluster / resource pools act on CPU and memory usage.
The storage load is driven by the kind of VM that runs from a given LUN (as said before, a LUN hosting SQL Server will be much more stressed than a LUN hosting a web service; it does not really matter which host runs a given VM).
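This point, that LUN stress depends on which VMs live on a LUN rather than on which host runs them, can be sketched as a greedy placement that balances estimated IOPS across LUNs. The VM names and IOPS figures below are made up for illustration:

```python
# Greedy VM-to-LUN placement: put each VM (heaviest first) on the
# currently least-loaded LUN, so I/O-heavy VMs don't stack up together.
# VM names and IOPS estimates are hypothetical.

def place_vms(vms, lun_count):
    """Assign each (name, est_iops) VM to the least-loaded LUN so far."""
    luns = [{"vms": [], "iops": 0} for _ in range(lun_count)]
    for name, iops in sorted(vms, key=lambda v: -v[1]):  # heaviest first
        target = min(luns, key=lambda l: l["iops"])
        target["vms"].append(name)
        target["iops"] += iops
    return luns

vms = [("sql1", 800), ("sql2", 700), ("web1", 100),
       ("web2", 100), ("file1", 300)]
layout = place_vms(vms, 2)
# The two SQL VMs land on different LUNs instead of stacking on one.
```

This is the storage-side decision; it holds whether you run one cluster or two, because the LUN sees the same aggregate I/O either way.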

I still think that with a total of 6 hosts, a single cluster will be better. You can still use DRS settings to pin some machines to given hosts.

With HA properly set (i.e., not restarting the DEV environment), you have a higher failover capacity.
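The failover-capacity argument can be checked with simple arithmetic. Host sizes below follow the thread (four 32 GB blades after the upgrades plus one 16 GB blade); the total PRD memory demand is an assumed figure:

```python
# Rough single-cluster failover check: after losing the largest host,
# can the remaining hosts still hold the PRD memory footprint
# (assuming HA does not restart DEV VMs)? Demand figure is assumed.

HOSTS_GB = [32, 32, 32, 32, 16]   # all five blades in one cluster
PRD_DEMAND_GB = 80                # assumed total PRD memory footprint

def survives_one_failure(hosts_gb, demand_gb):
    """True if capacity minus the largest host still covers demand."""
    worst_case = sum(hosts_gb) - max(hosts_gb)
    return worst_case >= demand_gb

print(survives_one_failure(HOSTS_GB, PRD_DEMAND_GB))  # True: 112 GB >= 80 GB
```

With two separate clusters the same check has to pass inside the 3-host PRD cluster alone, which is a tighter constraint; that is the extra headroom the single-cluster approach buys.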