VMware ESX Environment Design

asked by Joe
Good Morning All,

I am attempting to redesign our current VMware environment and was looking for some guidance from other VMware experts. Here is our current set-up:
______________________________________________________________________
Production:
3 - HP blades, all running VMware ESX 3.5 (each server has 8 cores and 16GB RAM)
1 - Hitachi SAN with 8 LUNs (connected via Fibre Channel)

Development:
2 - Dell 2850s, both running ESX 3.0 (each with a quad-core CPU and 8GB RAM)
Connected to the Hitachi SAN (via Fibre Channel)

1 - VirtualCenter instance controlling all servers
_____________________________________________________________________
We just purchased 2 HP blades (dual-processor quad-cores with 32GB RAM) and will most likely upgrade 2 of our existing blades to 32GB of RAM. I have 2 questions about how I should structure the environment once the 2 new servers arrive. I am the new VMware admin, and the previous admin and I disagree about how the environment should be set up.

A. Should I put all 5 blades together into 1 big cluster and then split up the environment through resource pools? Currently I have DEV and PRD resource pools: DEV gets normal shares and a smaller reservation of CPU and RAM, and PRD gets higher shares and larger CPU and RAM reservations.

B. I could keep the servers grouped in clusters (3 PRD, 2 DEV), give DEV and PRD separate LUNs, and then only create resource pools for high-value servers.
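
For what it's worth, option A's pool layout is scriptable. Below is a minimal sketch using pyVmomi, today's Python SDK (anachronistic for VirtualCenter 2.5, but the underlying managed objects are the same); the connection details and reservation figures are illustrative assumptions:

    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    # Connect to VirtualCenter (hypothetical host name and credentials).
    si = SmartConnect(host='vc.example.com', user='administrator',
                      pwd='secret', sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    def alloc(level, reservation):
        # Share level plus a hard reservation; the limit stays unlimited.
        a = vim.ResourceAllocationInfo()
        a.shares = vim.SharesInfo(level=level, shares=0)
        a.reservation = reservation   # MHz for CPU, MB for memory
        a.limit = -1
        a.expandableReservation = True
        return a

    # Grab the first (and, under option A, only) cluster; assumes it exists.
    cluster = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True).view[0]
    root = cluster.resourcePool

    # PRD: high shares and a large reserve; DEV: normal shares, smaller reserve.
    root.CreateResourcePool('PRD', vim.ResourceConfigSpec(
        cpuAllocation=alloc('high', 12000),
        memoryAllocation=alloc('high', 49152)))
    root.CreateResourcePool('DEV', vim.ResourceConfigSpec(
        cpuAllocation=alloc('normal', 4000),
        memoryAllocation=alloc('normal', 16384)))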

The previous admin is worried because we used VCB in the past and had issues. When we ran nightly backups, we completely clogged our VMware network and had to ditch that backup option (we have 2 NICs on each blade: one for the VMs, the other for the host). He is afraid that with all 5 hosts able to see all available LUNs, the disk I/O might cause problems on our SAN. Is this something we should worry about?

Thanks for your help, everyone. Any and all ideas are greatly appreciated!

Joe
SOLUTION by vmwarun - Arun
[solution content available to Experts Exchange members only]
Joe (Asker):
We were using LAN mode to back up all of the servers. We had a standalone VCB server, and we ran the backups across our network.

Can I ask why you would create 2 separate clusters as opposed to setting up resource pools?

Joe
SOLUTION
[solution content available to Experts Exchange members only]
Froggy_chris:
Bit late, discussion was started already :)
Since you have a Fibre Channel SAN, you could have used SAN Mode to back up your VMs instead of LAN Mode. LAN Mode pushes backup traffic over the network infrastructure, which is the primary reason you experienced problems with VCB initially.
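
For reference, forcing the SAN transport for a full-VM backup looked roughly like this on a VCB proxy (host name, credentials, VM name, and mount path are placeholders; as far as I recall, vcbMounter's -m switch selected the transport, and the mode could also be set globally in the framework's config.js):

    vcbMounter -h vc.example.com -u backupuser -p secret -a ipaddr:vm01 -r D:\mnt\vm01-fullvm -t fullvm -m san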

I suggested separate clusters with ease of administration in mind. I only suggest resource pools when there is a single department whose requirements determine how you allocate resources to your VMs.
ASKER CERTIFIED SOLUTION by Irwin W.
[solution content available to Experts Exchange members only]
Joe (Asker):
nappy_d, that makes a lot of sense, and thanks for your input. We are connecting to our SAN via Fibre Channel with a mix of SATA and Fibre Channel disks. If we are breaking up our LUNs to control I/O performance, would it make much of a difference to split our environment into 2 clusters and divide the LUNs between them? If all hosts have access to all LUNs, does that increase (or could it increase) I/O hits and thus degrade performance, or does it only matter which VMs are stored on which LUNs? What about security; is that even a concern?

Joe
The I/O hit actually depends on the type of VMs hosted on the ESX hosts.

Hosting multiple SQL database VMs on the same host, all with visibility to the same LUN, might be bad for I/O.

As long as you mask your LUNs using zoning or LUN masking, security should not be a problem.
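
As a quick way to verify what the masking actually exposes, each host can be asked which LUNs it sees. A minimal Python sketch with pyVmomi (again anachronistic for ESX 3.x, where the VI Perl Toolkit was the usual choice, but the objects match), reusing the connection from the earlier snippet:

    from pyVmomi import vim

    # 'si' and 'content' come from the earlier connection sketch.
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True).view
    for host in hosts:
        # Keep only physical disks; skips controllers, CD-ROMs, etc.
        disks = [lun.canonicalName
                 for lun in host.config.storageDevice.scsiLun
                 if isinstance(lun, vim.host.ScsiDisk)]
        print(host.name, '->', sorted(disks))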
Yes, if ALL hosts are accessing all LUNs simultaneously and performing read/write operations, you will see a performance hit, which is why I would suggest you break up your LUN assignments and create smaller disk groups.

Also, since it seems you have the resources, it would not be a bad idea to create two separate clusters with ONE VC: one for DEV and one for PRD (see the sketch after the LUN example below). This would help reduce resource constraints if a node in the cluster were to fail.

So I would think about creating LUNs similar to this:

Disk group 1 has 4 LUNs:
  • LUNs 1, 2, 3, and 4 get assigned to the DEV cluster
  • LUNs 1 & 2 get assigned to DEV guests 1 & 2
  • LUNs 3 & 4 get assigned to DEV guests 3 & 4
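
Circling back to the one-VC/two-cluster suggestion, creating the clusters themselves is only a few calls. A hedged pyVmomi sketch; the cluster names and the single-datacenter assumption are mine:

    from pyVmomi import vim

    # Assumes one datacenter under the root folder ('content' as before).
    datacenter = content.rootFolder.childEntity[0]
    for name in ('PRD', 'DEV'):
        spec = vim.cluster.ConfigSpecEx(
            dasConfig=vim.cluster.DasConfigInfo(enabled=True),   # HA on
            drsConfig=vim.cluster.DrsConfigInfo(enabled=True))   # DRS on
        datacenter.hostFolder.CreateClusterEx(name=name, spec=spec)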
Joe (Asker):
nappy_d:

I just want to make sure I am following you. When you say "node," I assume you are referring to a host, or any piece of my blade center that could potentially fail? In addition, you are suggesting that splitting my environment into 2 clusters, as opposed to 1 large cluster with resource pools, would be the better option?

PRD - 3 hosts (32GB and 8 cores per server)
SAN Fibre Channel connected

DEV - 2 hosts (16GB on one / 32GB on the other, 8 cores per server)
SAN Fibre Channel connected

I already have 8 LUNs and plan to requisition 2 more: 4 for DEV, 6 for PRD.

Thanks for the clarification.

Joe



I see a big difference between the cluster question and the storage question.

The cluster / resource pools govern CPU and memory usage.
The storage side is driven by the kind of VM that runs from a given LUN (as said before, a LUN hosting SQL Server will be much more stressed than a LUN hosting a web service; it does not really matter which host runs a given VM).

I still think that with a total of 6 hosts, a single cluster will be better. You can still use DRS settings to pin particular machines to given hosts.

With HA properly set up (i.e., not restarting the DEV environment), you have higher fail-over capacity.
Chris
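
To make the single-cluster variant concrete: both of the knobs mentioned above (pinning VMs via DRS and keeping HA from restarting DEV) are per-VM cluster overrides. A hedged pyVmomi sketch; the VM name is a placeholder:

    from pyVmomi import vim

    # 'content' from the earlier connection sketch; single-cluster layout.
    cluster = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True).view[0]
    vms = content.viewManager.CreateContainerView(
        cluster, [vim.VirtualMachine], True).view
    vm = next(v for v in vms if v.name == 'dev-vm01')   # placeholder name

    spec = vim.cluster.ConfigSpecEx(
        # DRS override: leave this VM where it sits unless moved by hand.
        drsVmConfigSpec=[vim.cluster.DrsVmConfigSpec(
            operation='add',
            info=vim.cluster.DrsVmConfigInfo(key=vm, enabled=True,
                                             behavior='manual'))],
        # HA override: never spend failover capacity restarting this DEV VM.
        dasVmConfigSpec=[vim.cluster.DasVmConfigSpec(
            operation='add',
            info=vim.cluster.DasVmConfigInfo(
                key=vm,
                dasSettings=vim.cluster.DasVmSettings(
                    restartPriority='disabled')))])
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)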