David Haycox


"Remove all snapshots" very slow after shutting down VM from the OS

We have a VMware host running ESXi v5.5.0.  A (Server 2012) VM has many snapshots.  I ran "remove all snapshots" while the VM was running.  This got to around 50% in about an hour.  I then foolishly shut down the OS from within Windows in the hope that this would speed up the removal.  In fact it appears to have now slowed to a crawl (it's been on 56% for over two hours), with the VM still showing as powered on, with a blank screen (because you can't make any changes to VMs while snapshots are being consolidated).

I tried to cancel the snapshot removal which didn't appear to make a difference, although the option is now greyed out.

Is there any way to either reboot the VM or speed up / abort the snapshot removal - or is it just a matter of waiting?  If the latter, is the percentage complete figure likely to be consistent and accurate, or might it just suddenly jump to 100%?

Thanks in advance for any suggestions.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

We have a VMware host running ESXi v5.5.0.  A (Server 2012) VM has many snapshots.  I ran "remove all snapshots" while the VM was running.  This got to around 50% in about an hour.  I then foolishly shut down the OS from within Windows in the hope that this would speed up the removal.  In fact it appears to have now slowed to a crawl (it's been on 56% for over two hours), with the VM still showing as powered on, with a blank screen (because you can't make any changes to VMs while snapshots are being consolidated).

Oh Dear!

The Answer is Be Patient!

Do not mess, meddle, power off, shut down, restart, cancel or fiddle while a snapshot is being removed, otherwise serious corruption can occur!

I tried to cancel the snapshot removal which didn't appear to make a difference, although the option is now greyed out.

Worse!

Is there any way to either reboot the VM or speed up / abort the snapshot removal - or is it just a matter of waiting?  If the latter, is the percentage complete figure likely to be consistent and accurate, or might it just suddenly jump to 100%?

Now you've been told off, let's see if we can remedy the situation!

Be Patient.....

Can you upload a screenshot of the folder.... for me to examine...

Now that you have fiddled, you have caused it to HANG....

but let's get a screenshot...

Grab a coffee, or beer, relax chill out, read my EE Article, whilst I look at your snapshot...

HOW TO: VMware Snapshots :- Be Patient
David Haycox

ASKER

Hi Andrew,

I just found your article - a little too late, alas.  I consider myself told off (as if I hadn't chided myself enough already)!

Many thanks for the quick response.  Here are the folder contents:

[Two screenshots of the VM folder contents]
Before you say anything, this is a replica server (using Veeam) and was never intended to be used to run VMs... until the production server died and we just switched on the replica without first clearing its 25 snapshots...!

I can confirm that removing all snapshots on another (smaller, powered-off) VM on the same host completes successfully in only a few minutes.

Thanks again.
Okay, don't have to study it too much.... the reason it's slow is the sheer number of snapshots. There is a magic limit to the maximum number of snapshots a VM can have, beyond which the VM would fail and not power on. I was told by VMware this is 29, but I have seen VMs with up to 72 snapshots and still running!

Anyway.... how long has this been running, and can you see any changes in the datastore, e.g. snapshots disappearing?
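One way to check for changes in the datastore is to count the snapshot delta files in the VM's folder from the ESXi shell every few minutes. This is only a minimal sketch: it assumes the snapshot extents use the usual `-delta.vmdk` naming, and the function name `count_delta_files` is made up for illustration.

```shell
# count_delta_files DIR
# Prints how many snapshot delta extents (*-delta.vmdk) sit directly in DIR.
# Assumption: snapshot extents are named "<disk>-00000N-delta.vmdk".
count_delta_files() {
    dir="$1"
    count=0
    for f in "$dir"/*-delta.vmdk; do
        # When nothing matches, the glob stays literal, so test for existence.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Run it as `count_delta_files "/vmfs/volumes/<datastore>/<VM folder>"` and note the number. A count that falls between runs may suggest the removal is still making progress; an unchanged count on its own does not prove it has hung.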
It has been running for about 4 hours 40 minutes.  The progress bar was moving well until I shut down the VM's OS (about an hour in).  It has moved since then, but only by a couple of percent and not at all in the last 3 hours.

I have been monitoring the datastore files, but only for about the last hour or so.  There were 86 files then, and the same now.  That's 75 for the 25 snapshots of the 3 virtual disks, plus 11 assorted others?  Hmm.
The other VMs have files like "VMname-snapshotXXX.vmsn" in them (one for each snapshot by the look of it), but these are  missing from the troublesome one.  Is that expected?  Thanks.
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

This solution is only available to members.
Sounds like a plan.  If I can just get the SQL database files off, that will be enough!  Will post back when have news.  Thanks again.
Restarted management agents (that was easy, have iLO access so did it from the console).  No sign of snapshot removal but VM still showing powered on.  Controls to stop available but still in progress after several minutes.  May have to reboot host?
What is the state of the VM, on or off ?

can you perform any tasks on the VM ?

or does it state task in progress, or not in this state!
VM state was on, tasks can be performed and I tried to power it off.  States task in progress.  Snapshot manager shows no snapshots (all files still present in folder though).

Anyway, I waited a few minutes then rebooted.  Was able to power up the VM fine!!  Am copying database files off it manually to just put them back on the original VM (host's mainboard failed).  I reckon that'll be faster than a Veeam recovery, especially with the snapshots being all over the place.

You are a hero, sir!  I have been (rightly) chastised and learned a great deal.  I certainly won't be making that mistake again.  Thank you very much, indeed.
Do you have vCenter Server? I would assume so.

Do you need to clean up these snapshots ?
We have vCenter server but it's not managing that particular host (the Essentials licence allows for only three hosts, so it's full).  The host in question here is at a different site and is used mainly just to keep replicas of the VMs on the three hosts managed by the vCenter.

If I get the database files copied off cleanly (so far so good) then we don't need to clean the snapshots - once confirmed it's okay on the original host again I'll just delete it so Veeam can recreate a replica.  If you have anything you can copy and paste about snapshot clean-up I'd be interested, but don't waste any time on it as it looks (fingers crossed) like we're done here.
Two methods....

1. Take a snapshot
2. Wait 60 seconds
3. DELETE ALL in Snapshot Manager
4. Be patient
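Without vCenter, the same four steps can in principle be driven from the ESXi shell with `vim-cmd`. The sketch below only prints the commands so they can be reviewed first; it runs nothing. The function name and the snapshot name `cleanup` are placeholders, and the VM ID is whatever `vim-cmd vmsvc/getallvms` reports for the VM.

```shell
# print_snapshot_cleanup_cmds VMID
# Prints (does not run) ESXi shell commands mirroring the four steps above.
# VMID is the numeric ID listed by "vim-cmd vmsvc/getallvms".
# "cleanup" is a placeholder snapshot name, not something from this thread.
print_snapshot_cleanup_cmds() {
    vmid="$1"
    echo "vim-cmd vmsvc/snapshot.create $vmid cleanup"   # 1. take a snapshot
    echo "sleep 60"                                      # 2. wait 60 seconds
    echo "vim-cmd vmsvc/snapshot.removeall $vmid"        # 3. delete all
    echo "# 4. be patient: leave the VM alone until the removal finishes"
}
```

For example, `print_snapshot_cleanup_cmds 12` lists the commands for the VM with ID 12.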

However, with your situation, and the number of snapshots, I would not do this...

I would use a CLONE. With no vCenter you will have to do this at the console or log in via SSH, and perform the following...

vmkfstools -i <most recent snapshot file name> /vmfs/volumes/<temp folder name>/<newfilename.vmdk>

This will create a brand new disk with all the snapshots merged into it. You can then safely add this disk to the VM, and discard the old disks and ALL the snapshots!

Done....

(you can also use VMware Converter to create a V2V)
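As a rough sketch of the clone step, the helper below builds the `vmkfstools -i` command for one disk and prints it for review rather than running it. The function name and the file names in the usage example are hypothetical. It refuses `-delta.vmdk` and `-flat.vmdk` files, because `vmkfstools -i` expects the small descriptor `.vmdk` of the most recent snapshot, not the data extent.

```shell
# build_clone_cmd SRC_DESCRIPTOR DEST_VMDK
# Prints the vmkfstools clone command for one virtual disk; it does not run it.
# SRC_DESCRIPTOR: descriptor .vmdk of the most recent snapshot of that disk.
# DEST_VMDK: full path of the new, merged disk to be created.
build_clone_cmd() {
    src="$1"
    dest="$2"
    case "$src" in
        *-delta.vmdk|*-flat.vmdk)
            echo "error: pass the descriptor .vmdk, not the -delta/-flat extent" >&2
            return 1
            ;;
    esac
    echo "vmkfstools -i \"$src\" \"$dest\""
}
```

Usage, with made-up names: `build_clone_cmd "myvm-000025.vmdk" "/vmfs/volumes/<temp folder name>/myvm-merged.vmdk"`. A VM with three virtual disks, as here, would need one clone per disk.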
Excellent, filed for future reference.  Much obliged.