VMware Guest Intermittently Freezing


We have 3 servers at 3 separate sites, all running IBM hardware.

VMware vSphere 5.1

Windows SBS 2011 is the guest operating system. The hardware at each location is slightly different, but the issue is very similar.

The guests freeze intermittently. You can view the console but cannot do anything, and the only solution is to reboot the VM.

Very frustrating for the client. It can happen 3 times a week or 2 times in a month; it's really intermittent.

Would love to resolve this.

We have tried replacing network cards, reinstalling drivers, and reinstalling VMware Tools, and it still happens.


compdigit44 Commented:
When the VMs freeze, are you able to ping them or access them via RDP?
Have any Windows updates been installed recently?
Which video display driver are you using?

Have you checked the logs on the host / VM for errors? What about the Windows event log?
Michael-Best Commented:
In the virtual machine settings, click the Hardware tab to view the default devices added to the virtual machine.
Remove the devices that you do not use (like CD/DVD (IDE), floppy, and sound card) that by default have "auto detect" in the summary column.
The VM will try to detect those devices automatically, and the unnecessary processing can cause a freeze.

Hope this solves the problem.
If all three systems freeze at the same time, start with what might be common to all of them: network, shared storage, etc.

If it happens to all three, but not at the same time, then it could be some stress on the VMware system, like doing a large backup or a shortage of some resource like CPU or memory, or some storage latency.

It could also be something with the install. Perhaps SBS is misconfigured somewhere or is not configured with enough resources to do its job all of the time. Then you need to look at Windows and its use of resources. Also check the event logs for any errors.
elevatecs (Author) Commented:
One system is a conversion from a working machine that had zero operating system issues and no freezing beforehand; this is an M4 version.

One system is a new install onto new hardware, but an M3 version of an IBM server.

I have done this process 25 times; these 2 servers in particular are very problematic. Like I said, it can be 1-2 times per week, or once per day.

Nothing seems to trigger it, and nothing is scheduled. It just seems to happen randomly. It has happened 3 times this weekend on the conversion server. Customers report that they are working, everything comes to a slow grinding halt, and then it freezes and we need to restart the guest.

We have just reinstalled the VM NIC and applied all Windows updates, and are scouring the logs now for answers.

Both are ESX version 5.1.

It is something very strange, some kind of weird memory leak or something, really hard to diagnose :(.

If you have a third-party ESXi monitoring program, check that for resource spikes. If not, use the built-in charts in the vSphere Client. Check over a time frame such as a week, looking for spikes in CPU, memory, bandwidth and disk usage. Check this on the hosts as well as the guests.

You can also use the CLI app, esxtop, but it's not as user friendly and the GUI gives you a better picture.
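If you do capture samples from the command line (esxtop has a batch mode, `esxtop -b`, that writes CSV), a small script can scan the export for spikes. This is only a rough sketch: the column names and sample values below are made up for illustration, and real esxtop exports use much longer counter names, so adjust `column_hint` to match your own file.

```python
import csv
import io


def find_spikes(csv_text, column_hint, threshold):
    """Return (timestamp, value) pairs where the chosen column exceeds threshold.

    column_hint is a substring used to pick the first matching column header.
    The first column is assumed to hold the sample timestamp.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Pick the first column whose header contains the hint.
    matches = [i for i, name in enumerate(header) if column_hint in name]
    if not matches:
        raise ValueError(f"no column matching {column_hint!r}")
    col = matches[0]

    spikes = []
    for row in reader:
        if len(row) <= col:
            continue  # skip short or truncated rows
        try:
            value = float(row[col])
        except ValueError:
            continue  # skip non-numeric cells
        if value > threshold:
            spikes.append((row[0], value))
    return spikes


# Hypothetical sample data; header names are placeholders, not real counters.
SAMPLE = """\
Time,cpu_util_pct,mem_free_mb
10:00:00,12.5,8000
10:00:05,15.0,7900
10:00:10,97.0,7800
10:00:15,20.0,7850
"""

if __name__ == "__main__":
    for timestamp, value in find_spikes(SAMPLE, "cpu_util_pct", 90.0):
        print(f"{timestamp}: {value}")
```

Running the same scan against host and guest captures for the same period makes it easier to see whether a spike lines up with the time of a freeze.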

Usually when we've seen serious slowdowns on a guest, it's been on one that's having a snapshot removed after a Veeam backup, or we've had something else going on in the environment that negatively impacted one or more guests - but we've been able to see it in the Performance tab of the guest or the host.

What kind of storage are you using?  

If you can get into the IMM of the hosts, it will report hardware issues to you for many of the components.
elevatecs (Author) Commented:
Storage is a RAID 10/5 array on the IBM server; it is all locally attached.

No RAID issues or hardware issues. We were going to run a DSA diagnostic over the server if the issue persisted.

We have GFI Max monitoring the ESX host, so I will try to get you the information requested, but I don't think it is a performance issue.

Just out of curiosity, is there a recommended minimum amount of free space on the datastore?

We have 30GB free on one ESX host, which is using 16GB for swap space; I just was not sure if there is a minimum needed.

It is one guest on one ESX host and there are heaps of resources available, so I don't think it is a resource issue... but happy to look into it.
jhyiesla Commented:
I don't think there's a minimum space required as long as there's enough for both the guest and the host to do their thing.  DSA may tell you something, but since it's happening on three different hosts, it's not likely that it's a hardware failure of some kind.

If you have more than one guest on each host and it's only the one guest that's having an issue, perhaps look at the Windows side, especially if the guests with the issues are all SBS.

I'd not discount resources used by the troublesome guests just yet, although I understand that you have plenty of resources available.

Does the slowness last long enough that you can look at resource usage in real time?  Do you have the guests set up so that you can remote control to them?  If you get to a point where it appears that you can't get to the guest with the console, check to make sure you can access other guests via the console.  Then see if you can remote control to the guests.
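To pin down exactly when a guest stops answering remote connections, a simple probe run from another machine can log each change in reachability. The sketch below assumes the guest has RDP enabled on the default TCP port 3389; the hostname in the usage note is a placeholder. A successful connection only shows that the guest's network stack is still answering, not that the desktop is usable.

```python
import socket
import time
from datetime import datetime


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port=3389, interval=10.0, checks=None):
    """Print a timestamped line whenever reachability changes.

    checks=None runs until interrupted; pass a number to stop after
    that many probes.
    """
    last = None
    count = 0
    while checks is None or count < checks:
        state = tcp_reachable(host, port)
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            label = "reachable" if state else "UNREACHABLE"
            print(f"{stamp} {host}:{port} {label}")
            last = state
        count += 1
        if checks is None or count < checks:
            time.sleep(interval)
```

For example, `watch("sbs-guest.example.local", interval=10.0)` (a made-up hostname) would note the time of each freeze to within about ten seconds, which can then be matched against the host and guest logs.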
elevatecs (Author) Commented:
Unfortunately it is one guest per one ESX server, so doesn't help the cause.

Unfortunately it happens so intermittently, and the monitoring doesn't tell me the guest is offline until around 5 minutes after it has stopped responding.

But the clients tell me that everything gets really slow then it stops... how reliable this information is, is difficult to tell.

Anything else anyone can suggest to make this work, or to eliminate?

Traps to look out for after conversion etc?

elevatecs (Author) Commented:
An array of Windows updates and updating all drivers / tools seems to have stopped this issue from occurring, so a mixture of everything seemed to resolve this.

