Solved

VM virtual servers excessive reboot times

Posted on 2010-09-17
Last Modified: 2012-08-13
We're relatively new to VMware.  We have two physical boxes running the latest version of free ESXi.  The servers are new HP ProLiant ML350 G6's configured with RAID-5.  Plenty of horsepower I would think.  32GB RAM.  Each physical box is hosting 3 virtual servers, none of which have excessive load; i.e. they're basically file servers.  The virtual servers are all running Win2K3 R2 and each is allocated 4GB of RAM.

We've noticed that if we initiate a Shutdown/Restart on a virtual server, it takes upwards of 30 minutes to finish the reboot cycle.  Compare this with a similar Windows server, non-VM, that would take 4-5 minutes.  If we pull up the console via the vSphere Client, we'll stare at a gray screen for 20+ minutes.  Oftentimes we end up powering off the instance in question just because it takes too long to wait for it to finish.  Both physical boxes and all six virtual servers are exhibiting these symptoms.  I'm thinking we must be missing some VM-101 setting that we're unaware of.

Any ideas??
Question by:SBSIAdmin
17 Comments
 
LVL 28

Accepted Solution

by:
bgoering earned 666 total points
ID: 33706197
Have you installed the vmware tools in all of your guests?
0
 

Author Comment

by:SBSIAdmin
ID: 33706201
Yes.  We originally thought that was it, but no luck.
0
 
LVL 28

Assisted Solution

by:bgoering
bgoering earned 666 total points
ID: 33706259
Tell me a bit more about your environment. How were these VMs created? Were they a P2V from a physical box? Have you adjusted the RAM size? As an experiment, drop one to 1 or 2GB of RAM and see if it boots faster. If it does, adjust it back to 4GB and it should be OK going forward. It is a bit inexplicable why that works, but it has been an occasional bug in VMware.

Another thing to look at is your RAID controller and Disk I/O time. Some of the lower end RAID controllers don't come by default with battery backed write cache. What this means to you is that disk writes are exceptionally slow. You will see long waits generally when powering on while the vmware swap file is created, but also during boot up process as the OS is initializing its paging files and such. You can look at the storage path and storage adapter reports in the performance tab of the client for an idea how long disk latency is. I generally like to see disk latency 20 ms or less.
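
If you can export or jot down a few latency readings from the performance charts, a rough way to compare them against the 20 ms rule of thumb is sketched below. The function name and the sample readings are made up for illustration; they are not measurements from these hosts.

```python
from statistics import mean

LATENCY_TARGET_MS = 20.0  # rule-of-thumb ceiling mentioned above


def summarize_latency(samples_ms, target_ms=LATENCY_TARGET_MS):
    """Return the average, the worst case, and the share of samples over target."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    over_target = [s for s in samples_ms if s > target_ms]
    return {
        "avg_ms": mean(samples_ms),
        "max_ms": max(samples_ms),
        "fraction_over_target": len(over_target) / len(samples_ms),
    }


if __name__ == "__main__":
    # Hypothetical readings in milliseconds, for illustration only.
    readings = [8.0, 12.0, 45.0, 130.0, 15.0]
    print(summarize_latency(readings))
```

A high fraction over target during shutdown or power-on would point toward the storage path rather than memory.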
0

 

Author Comment

by:SBSIAdmin
ID: 33706338
Thanks for the response bgoering.  Those sound like some promising suggestions.  It's about 10pm EST for me right now and I may not get to your suggested changes until Monday.
The servers were native builds as opposed to P2V's.
I'll try the memory trick and also get the specs on the RAID controller.  I'm guessing the RAID controller is relatively base since it's the base controller that you can get with the server on the motherboard.
I'm looking in the vSphere Client and don't see where to obtain the disk latency numbers.  On [Performance] I can select [Disk], but I don't see anything for storage path or storage adapter.
I did just peruse the [Events] tab and was reminded of a message that I had seen in the past.
Message from VM1:
Insufficient video RAM. The maximum resolution
of the virtual machine will be limited to 1176x885
at 16 bits per pixel. To use the configured
maximum resolution of 2360x1770 at 16 bits per
pixel, increase the amount of video RAM allocated
to this virtual machine by setting svga.vramSize=
"16708800" in the virtual machine's configuration
file.

I wonder if this could be adding to the reboot times.
0
 
LVL 16

Expert Comment

by:Danny McDaniel
ID: 33706457
Did you set a memory limit (edit settings | resources, click on memory)?  If so, change it back to unlimited.

This sounds more like a memory contention/limitation issue than an I/O issue.  You might also try uninstalling VMware Tools to see if the balloon driver or another component is interfering somehow.  (Not sure from your earlier reply if you meant that you hadn't had Tools installed or had just thought to check that they were installed.)

Are you over-committed on memory?  What is your host memory consumption looking like?
0
 
LVL 28

Expert Comment

by:bgoering
ID: 33707752
No, that is just an informational message and does not indicate a problem. If you want to make the message go away take a look at this knowledge base article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1024990
0
 
LVL 10

Assisted Solution

by:Justin C
Justin C earned 334 total points
ID: 33707797
Within your Windows VMs, ensure that the "Clear page file on shutdown" security setting is not set to "Enabled".

This setting will cause Windows to zero out the page file on shutdown/reboot, and with a system with a large amount of memory and a correspondingly large page file configured this can take a very long time.  
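
To get a feel for how long zeroing could take, a back-of-the-envelope estimate is page file size divided by sustained write speed. The sketch below assumes a 6 GB page file and two write speeds purely as examples; neither figure was measured on these servers.

```python
def seconds_to_zero_pagefile(pagefile_gb, write_mb_per_s):
    """Estimate the time to overwrite a page file with zeros.

    Ignores caching and competing I/O, so treat the result as a rough figure.
    """
    if write_mb_per_s <= 0:
        raise ValueError("write speed must be positive")
    return pagefile_gb * 1024 / write_mb_per_s


if __name__ == "__main__":
    # Assumed values for illustration: a 6 GB page file at two write speeds.
    for speed in (10, 100):
        minutes = seconds_to_zero_pagefile(6, speed) / 60
        print("{} MB/s -> {:.1f} minutes".format(speed, minutes))
```

At the slower assumed speed the estimate comes out to about ten minutes, which is why slow writes and this setting together can stretch a shutdown considerably.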
0
 
LVL 26

Expert Comment

by:lnkevin
ID: 33720538
Where are the vms located? Are they on local disks or SAN?
Most of the time, delays on VMs come from disk I/O. If you have a SAN, try moving your SAN card to a different, higher-numbered slot. If it's local storage, you may want to check the RAID and hard drives to ensure everything is OK. Next, run esxcfg-rescan to refresh the connection to your disks.

K
0
 

Author Comment

by:SBSIAdmin
ID: 33784513
I will admit, I posted this question on behalf of our senior level tech who had been the point person for our VM installs.  He was a newbie as well.  As it turns out, he has left our organization, so our two VM servers are now my responsibility.  Thus the delay since the last posting.
I just did a full reboot on one of our physical boxes.  I wanted to peruse the BIOS because I had also heard that there may be a BIOS setting that needs to be adjusted if the box will be running VM.  I saw a couple entries about memory and Intel virtualization, but nothing jumped out at me about VM per se.
As the machine was posting, I noticed a message about there being no battery backup on the RAID controller.  It seemed to indicate that it could be added, but it wasn't installed by default.  There was a previous comment about RAID controllers and battery-backed write cache.  There was also a BIOS setting for enabling write caching.  That's currently disabled and it comes with a warning about potentially losing data if there's a power outage, so I haven't enabled that.  I'm wondering if I should pursue getting the battery module for my RAID controller.
0
 
LVL 16

Expert Comment

by:Danny McDaniel
ID: 33784617
Yeah, not having the battery backed write cache enabled definitely does negatively affect storage performance.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 33784697
Yes, if the guests are hosted on the local RAID, the addition of battery-backed cache will make a huge improvement for disk writes. If the long delay is on the shutdown part of the reboot process, consider what BloodRed indicated above about the clear pagefile setting - that would add a large amount of time to the shutdown process.

If you shutdown and power off one of these vms, how long does it take to power on and come up?
0
 

Author Comment

by:SBSIAdmin
ID: 33817738
It was less than a year ago that we set our page files to clear on all servers as a result of a recommendation from our security consulting firm.  I tried turning that off to see if it would make a difference on the VM servers, but it hasn't.
I was unable to find a specific setting in BIOS that would "enable" the server to be a hypervisor.
I guess that leaves me with the battery backed cache option.  I'll have to obtain the exact model of RAID controller, confirm capabilities, and get a price.
0
 
LVL 16

Expert Comment

by:Danny McDaniel
ID: 33819139
you could give some vm's a full memory reservation and see if that makes a difference, too.  You'll still need to read from disk but it will preclude the need for a swap file so it would be a decent test to see if you're going down the right path before laying out the money.
0
 

Author Comment

by:SBSIAdmin
ID: 33837049
I ran a couple time tests tonight to get real numbers.  I rebooted one of the server instances on each of our two ESX boxes tonight.  Both were quite consistent.  11 minutes to shutdown and less than 2 minutes to boot up.  This test was done with clearing pagefile disabled.
Then I reset the GPO to enable clearing of the pagefile on shutdown.  Ran gpupdate /force and rebooted again.  The shutdown and restart times were just about the same.  That leads me to believe that the pagefile isn't significantly impacting the process.
I'm not certain what a "full memory reservation" means.  The physical boxes have 16GB of RAM in them.  One ESX is running 3 virtual servers and the other is running 2.  Each virtual server has been allocated 4GB of RAM.  So each virtual is running 4GB leaving 4 or 8GB for the hypervisor.
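
The memory arithmetic above can be written out as a quick check. This is only a sketch: it subtracts the configured guest memory from host RAM and ignores the hypervisor's own footprint and per-VM overhead, and the function names are invented for the example.

```python
def host_memory_headroom_gb(host_gb, vm_allocations_gb):
    """RAM left on the host after subtracting each guest's configured memory."""
    return host_gb - sum(vm_allocations_gb)


def is_overcommitted(host_gb, vm_allocations_gb):
    """True when the guests are configured with more RAM than the host has."""
    return sum(vm_allocations_gb) > host_gb


if __name__ == "__main__":
    # Figures from the post above: 16 GB hosts, 4 GB per guest.
    print(host_memory_headroom_gb(16, [4, 4, 4]))  # host with three guests
    print(host_memory_headroom_gb(16, [4, 4]))     # host with two guests
    print(is_overcommitted(16, [4, 4, 4]))
```

With these numbers neither host is over-committed on configured memory.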
I'm kind of leaning back to the battery backed cache.  Maybe the shutdown process is so disk intensive that we feel the effects, but daily usage as a file server or domain controller doesn't push the disk enough to notice it??
0
 
LVL 28

Expert Comment

by:bgoering
ID: 33837119
Double check to be sure the GPO update took - take a look at

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
Value Name: ClearPageFileAtShutdown
Value Type: REG_DWORD

If the Value is 1, the pagefile will be cleared, if 0 the page file will not be cleared.
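
If checking several guests by hand gets tedious, the same lookup can be scripted. The sketch below assumes a Python 3 interpreter is available inside the Windows guest, which may not be the case on these servers; it uses the standard-library winreg module and the key and value name given above. The helper names are made up for the example.

```python
import sys

KEY_PATH = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"
VALUE_NAME = "ClearPageFileAtShutdown"


def describe(value):
    """Translate the registry DWORD into plain language."""
    if value == 1:
        return "page file is cleared at shutdown"
    if value == 0:
        return "page file is not cleared at shutdown"
    return "unexpected value: {!r}".format(value)


def read_clear_pagefile_flag():
    """Read the value from HKEY_LOCAL_MACHINE (Windows only)."""
    import winreg  # standard library, only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    return value


if __name__ == "__main__":
    if sys.platform == "win32":
        print(describe(read_clear_pagefile_flag()))
    else:
        print("This check only runs inside a Windows guest.")
```

If the value does not exist, QueryValueEx raises an error, which is itself a hint that the policy never reached the machine.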

In any event, in the long run, to achieve satisfactory performance you will want to be able to configure write-back (rather than write-through) cache on your RAID controller. I would definitely get that upgrade.

Two minutes sounds reasonable for a startup time.

So far as the full memory reservation - I really don't recommend using that. It is generally better to let ESX manage the memory unless you have very special cases of critical workloads. To set it for a test, go into Edit Settings on your VM, click the Resources tab and there will be a place to set a reservation. For a full reservation, change it to the amount of memory you have allocated to the VM. What this buys you is that ESX doesn't have to create a swap file to back the RAM on that virtual machine. That is pretty much a low overhead activity unless you are severely memory constrained on your host, and it doesn't sound like that is the case. In any event, the overhead of a swap file isn't really incurred until such time as ESX has to swap out memory pages to disk. When that happens there are two writes to disk, one to zero the area, the other to write the memory contents so that physical RAM can be allocated to another virtual machine.
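
As a rough illustration of why a full reservation removes the swap file: the per-VM swap file is sized at configured memory minus the reservation. The function name below is invented for the example and the sketch ignores any overhead memory.

```python
def vswp_size_gb(configured_gb, reservation_gb=0):
    """Size of the per-VM swap file: configured memory minus reservation."""
    if not 0 <= reservation_gb <= configured_gb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_gb - reservation_gb


if __name__ == "__main__":
    print(vswp_size_gb(4))     # no reservation: swap file backs all 4 GB
    print(vswp_size_gb(4, 4))  # full reservation: nothing left to back
```

So for a 4GB guest, the test reservation would be the full 4GB.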

Good Luck
0
 

Author Comment

by:SBSIAdmin
ID: 34161083
I very much appreciate all of the responses, but unfortunately, none have helped to this point.  My next step is going to be battery-backed cache, but that's going to take some time, purchase, etc.  For the time being I'm going to close this question.  Again, thanks for all the responses.
0
 

Author Closing Comment

by:SBSIAdmin
ID: 34161111
Unfortunately, we still have the issue, but I'm not able to continue working on it at this time.  I need to get the proper part number for our hardware, determine the cost, get budget for it, make the purchase, install it, yada, yada.  There's no point in keeping the question open at this time.  Responses were great, but nothing I tried worked.  The last item to try requires more planning.
0
