VMWare Environment Very Unstable

Wow, where to start.

I'm inheriting a network that is running VMWare ESX on three physical HP Proliant DL380 Servers.  

Server one has dual 3.6 GHz Intel Xeon processors with 12 GB of RAM.  It's running 1 instance of Windows Server 2003 R2 SP2 with Microsoft Exchange 2003 Standard Edition, as well as 4 instances of Windows XP Pro for remote salespeople to access the Made2Manage client.

Server two has dual 3.2 GHz Intel Xeon processors with 12 GB of RAM.  It's running 1 instance of Windows Server 2000 SP4 and is used exclusively for the Made2Manage database and SQL engine.

Server three has dual 3.6 GHz Intel Xeon processors with 4 GB of RAM.  It's running 1 instance of Windows Server 2003 R2 SP2 and handles all data storage and printing for the network.  It also has 1 instance of Windows XP Pro for the timeclock application.

As an added bonus we have an HP Modular Storage Array with (12) 750 GB drives.  All three servers use the SAN for the data drives, with the third server having a full terabyte for storage.

Here's the problem.  The first and third servers go down almost every day.  They crash, and when they try to come back up they give VMWare kernel errors.  After I reboot each of them half a dozen times they will finally boot, but then they are super slow for the next 6 hours.

I'm very experienced with networks and servers, but I've never dealt with production servers running VMWare ESX and I'm not sure where to even start looking for a solution.  The admin that was here before told them that the SAN that he spec'd and purchased was not completely compatible with VMWare, but he had been telling them that for the past 4 months.

This company is experiencing daily downtime of 1 to 4 hours and I need to get this to stop urgently.  Any and all help is greatly appreciated.

swolfersberger Asked:
65td (Retired) Commented:
How many NICs per server?
How are they allocated among the service console, VMkernel and virtual machines?
How are the servers connected to the SAN: HBAs or iSCSI?
swolfersberger (Author) Commented:
2 NICs per server

I'm not sure what you're looking for in regards to allocation.  Could you be more specific (or maybe point me in the right direction for the answer)?

iSCSI
Paul Solovyovsky (Senior IT Advisor) Commented:
I would rebuild the environment.  You can VMotion the virtual machines to other hosts and rebuild each server one by one so that you know you have a clean environment. Once each server is rebuilt you can remove it from VirtualCenter and add it back in.


The other issue you may be experiencing is that your HP MSA is most likely using SATA drives, since I haven't seen many 750GB SCSI drives.  You didn't mention the model number of the MSA or whether you're using iSCSI or FC, but you should troubleshoot I/O on it as well, using an application such as Iometer.

Hope this helps
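
As a rough first look before setting up Iometer, a sequential-write timing like the sketch below could be run inside a guest whose disk lives on the MSA. This is only an illustration, not a substitute for a proper I/O benchmark: the function name, the test directory and the sizes are assumptions, and it measures a single sequential write stream through the guest's file system.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=16, block_kb=64):
    """Time one sequential write of total_mb megabytes into directory.

    Hypothetical helper for a quick sanity check; a tool such as
    Iometer gives far more complete numbers.
    """
    # Random data, so storage that compresses zeros does not flatter the result.
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the OS cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file
    return total_mb / elapsed


if __name__ == "__main__":
    # Assumption: the system temp directory sits on the disk you want to test.
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"{rate:.1f} MB/s sequential write")
```

Comparing the figure from a guest on the SAN with one from a guest on local disk gives a crude idea of whether the storage path is the bottleneck.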
robocat Commented:

Are you running the latest VMware patches on these machines? Use Update Manager to check whether patches are needed and apply them if necessary.

Are you running 3.5 Update 1, 2 or 3? Perhaps you should consider upgrading the machines to Update 3 plus the latest patches.
swolfersberger (Author) Commented:
The version is 3.5.0, 82663

The MSA is an HP MSA 1500 CS with SATA drives and it is accessed with iSCSI.
swolfersberger (Author) Commented:
My only concern with rebuilding the environment is my lack of experience with the VMWare product.  You're correct that it would be a clean install, but I don't like to have my first installation happen in a production environment.  I would hate to make it worse.
Paul Solovyovsky (Senior IT Advisor) Commented:
See if you can get a consultant to assist with this issue; since it's causing site downtime, that may be the best course of action.  Not only can someone qualified take a look at your environment and baseline your issues, but if you are part of the project you can get a lot of hands-on experience with the product and ask questions as you go along.  Once the situation is stabilized you can obtain further VMWare training as needed.

My $.02
robocat Commented:

I concur with paulsolov: let a consultant take a look at this.

You're still running ESX 3.5 Update 1, so most likely nobody ever took the time to install patches either. Any unpatched system can have a lot of serious issues.

Together with the consultant you can upgrade the system to the latest release and apply all patches. Then, and only then, re-evaluate system stability.

swolfersberger (Author) Commented:
If what the previous guy said about the MSA being incompatible with VMWare is true, then would it make the most sense to remove the VMWare?
Paul Solovyovsky (Senior IT Advisor) Commented:
I haven't seen an MSA that wasn't compatible.  It may not be on the VMWare HCL, but I have used everything from the MSA 1000 to the MSA 2000 series.

What model do you have?
swolfersberger (Author) Commented:
The memory on server three may be my issue.  It only has 4 GB of RAM and according to the Infrastructure Manager it's running at around 92% memory utilization.

Is there a good resource that gives me a good understanding of what takes place when you migrate a virtual machine?

I just want to get a good feel for how long it takes and if I have to have everyone off of the system.
Paul Solovyovsky (Senior IT Advisor) Commented:
You can use Solarwinds Free VMWare Tool

http://www.solarwinds.com/products/freetools/vm_monitor.aspx

or download a trial copy of Vizioncore vFoglight, which will give you anything and everything you want to know about your environment.  If you find the issue within the trial period, good; if you like the tool, you can always buy it.
65td (Retired) Commented:
How many virtual machines are on server 3, and how much RAM is allocated to the VM(s)?
swolfersberger (Author) Commented:
Two virtuals on server 3 (one server and one workstation) with 3.50 GB out of 3.75 GB in use.
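
For reference, those figures work out to roughly 93% utilization with about a quarter of a gigabyte of headroom, which is in line with the ~92% reported earlier. A minimal sketch of the arithmetic (the function name is just illustrative):

```python
def memory_utilization(used_gb, total_gb):
    """Return (percent used, GB free) for a host's usable memory."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    percent_used = used_gb / total_gb * 100
    free_gb = total_gb - used_gb
    return percent_used, free_gb


# Figures from the post above: 3.50 GB in use out of 3.75 GB.
pct, free_gb = memory_utilization(3.50, 3.75)
print(f"{pct:.1f}% used, {free_gb:.2f} GB free")  # 93.3% used, 0.25 GB free
```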
robocat Commented:

Did you try installing the most recent ESX patches for this system? Did that solve your problem?

swolfersberger (Author) Commented:
Basically, I was able to stabilize the system by changing how the VMs were allocated and by removing the VMs' connections from one LUN in particular.  It appears that the MSA 1510i is not compatible with ESX 3.5, and HP is going to RMA it and swap us out to a different SAN.  Thanks for all your help.
Paul Solovyovsky (Senior IT Advisor) Commented:
Actually the MSA 1510i is fully compatible with ESX 3.5.  I have installed several of these and just finished one yesterday.  There are some things that you need to be aware of, such as no support for dual controllers with MPIO (hopefully to be resolved in a firmware update), but otherwise it works just fine.

My $.02