Solved

Removing virtual machine snapshots in ESX 4.1

Posted on 2013-01-17
Last Modified: 2016-11-23
I have a virtual Exchange 2003 which is residing on an Equallogic SAN. It had been working fine for a long time but last night was powered down unexpectedly. When I tried to reboot it via vSphere client it gave me a space error regarding the datastore: "Could not power VM: No space left on device."
The SAN volume was showing only about 1% free space whereas the datastore on vSphere was showing close to 40% free space. Why this difference? Aren't they supposed to show the same numbers?
I tried deleting a couple of snapshots on SAN but that didn't help. Then I called Dell tech support. They decided to increase the size of the volume on the RAID. After doing a couple of refreshes on vSphere the datastore space started showing the correct numbers close to those reported by SAN console. WHY?
At that point I tried another power on. It did power on, but the task stayed at 95% for about 10-15 mins. WHY?
Dell techs suggested that I clean snapshots of the machine from vSphere in order not to run into space issues anymore. How can I do that without risking any damage on the virtual machine itself?

Thank you.
0
Question by:cembi
56 Comments
 
LVL 7

Expert Comment

by:flaphead_com
ID: 38788576
Did you thin provision the SAN LUNs?
0
 
LVL 118
ID: 38788590
BE PATIENT! Have a cup of coffee, and read my EE Article.

HOW TO: VMware Snapshots :- Be Patient

Please upload a screenshot of the datastore, do not do anything else at this time.

I'll discuss your options.
0
 

Author Comment

by:cembi
ID: 38788793
Datastore screenshot

Here is a screenshot. The client is thick provisioned. The SAN volume has a max size of 1.2TB. There are 175GB free as of right now. How soon can I run into same trouble if no immediate measures are taken? I'd hate to go through last night's stress again soon :-(

Thanks.
0
 
LVL 118
ID: 38788818
Okay, the VM has 5 snapshots, approx 60GB. If you start Snapshot Manager, can you check whether they are listed?
0
 

Author Comment

by:cembi
ID: 38788879
Running on a snapshot
Snapshot manager
It seems like VM is running on a snapshot, right hancocka?
0
 

Author Comment

by:cembi
ID: 38788902
By the way your article is fantastic and explains it perfectly. Based on it I want to add that I have been experimenting with Veeam Free version. I have it installed on a physical Windows 2003 in my LAN and been using it to backup this virtual Exchange on to a USB external disc. I ran it a couple of times and it lasted about 9 hours and ended with a warning about snapshots. I hope this provides more clues.
0
 
LVL 118
ID: 38788979
The biggest issue with all backup products is that they can leave a VM running on a snapshot, and unless you regularly check as part of your daily VMware admin checks, or set up alarms or scripts to warn you of snapshots, the snapshots can quickly fill up a disk, and the VM fails and stops!

Yes, it is running on a snapshot disk.

Thanks for your kind comments, about the article.

Okay.....a number of ways we can attack this....

But the bottom line is BE PATIENT. Deleting the snapshot, and merging 60GB into the parent disk, can take hours, minutes or seconds. In your case hours; it depends on the storage system and how fast it is. It will look like it's doing nothing and sit at 95% (like it's hung). During this time, do not mess, cancel or fiddle, just walk away, it will finish!

can we turn off the VM?

it's quicker, does not require any additional disk space, BUT disadvantage, is the VM is out of action until the merge is complete?
0
 

Author Comment

by:cembi
ID: 38789133
It cannot be turned off and it cannot stay down for long unless absolutely necessary. It would only be done in a weekend but I would hate to do it during this weekend.
Is there another alternative? If I don't run Veeam in the meantime, does that make it safer until next weekend, when I can follow your advice?
Thank you.
0
 
LVL 118

Assisted Solution

by:Andrew Hancock (VMware vExpert / EE MVE)
Andrew Hancock (VMware vExpert / EE MVE) earned 500 total points
ID: 38789255
at present performance will be poor because of writes to a snapshot

1. disk I/O will be poor.
2. CPU usage will be higher than normal.

All because of the snapshot. The snapshot will continue to grow, and grow, hour by hour, day by day, you get the general idea, and it will fail, and other VMs will fail if they occupy the same datastore. You will then need to grow the LUN, and keep growing the LUN. The danger is that the larger the snapshot, the longer it will take to merge, and then you may not have enough weekend hours to commit the changes.

if you decide to merge the snapshot whilst the VM is ON, which can be done, I'll tell you how, the merge will take longer, storage space will be used, and performance could be worse on the VM.

as this is a realtime Exchange 2003 server, it's difficult decision time.

If you want to go ahead and complete this while the VM is powered on:

1. Select Snapshot, Take Snapshot
2. Check a new snapshot is created on the disk.
3. Check a new snapshot is created in the Snapshot Manager.
4. Click DELETE ALL this will Delete and Merge (Consolidate) ALL the Snapshots!

(untick snapshot memory, not interested in this!)

A word of caution, if you do this while the VM is on, it could take many hours, days, minutes, be patient, and sit it out.
0
 

Author Comment

by:cembi
ID: 38789420
Thank you hancocka. I really appreciate all the details. I think I will have to go with the powered on method. A few last questions and observations:
- Performance is already not optimal, delays in Outlook happen throughout the day.
- After the VM was successfully started, the Exchange Store and Attendant were not started. I had to start them manually. Could this issue be related to snapshots too?
- At the moment SAN volume is reporting around 200GB free space, datastore is reporting 172.95. Considering this is the only VM on the volume do you think this can wait until next Friday evening when I am thinking I can get it started? Is the datastore space number a good, reliable indicator to tell how fast snapshots are growing?
- Before I start the consolidation process per your directions, how can I create a good, reliable backup of this VM? Right now I am relying on SAN snapshots, Symantec Backup Exec Mailbox/Store backups and NTbackups. Do you think these are good enough for a fallback in case something goes wrong during consolidation? Running another Veeam backup obviously increases the risk of running out of space, right?

Thanks again.
0
 
LVL 118
ID: 38789469
Yes, CPU high performance issues due to high i/o disk load.

200GB is not a lot of space, but there is only a day left until the weekend. It really does depend on how many users you have and the traffic flow, e.g. the number of emails.

200GB is a safe margin, I believe, until Friday out of hours!

If you are talking about next Friday, e.g. 18th Jan + 7, that is dangerous!

A problem for you also is that Backup Exec and NT Backup, although good backups, will flush the logs, and hence the snapshot will grow by this amount!

Running another Veeam job will create another snapshot, which it could then try to merge after the backup has finished, but Veeam is the best to restore from.

Can you expand the LUN again? Do you have space?
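To put a rough number on "how long will the remaining space last", the following is a minimal sketch that extrapolates from two free-space readings of the datastore. It assumes linear growth, which a busy Exchange server with scheduled backups will not follow exactly, so treat the result as an optimistic upper bound. The function name and the sample readings are hypothetical and only for illustration.

```python
def hours_until_full(free_gb_then, free_gb_now, hours_between):
    """Estimate hours until the datastore is full, assuming linear snapshot growth.

    free_gb_then  -- free space (GB) at the earlier reading
    free_gb_now   -- free space (GB) at the later reading
    hours_between -- hours elapsed between the two readings
    Returns None when free space is not shrinking.
    """
    if hours_between <= 0:
        raise ValueError("hours_between must be positive")
    consumed_gb = free_gb_then - free_gb_now
    if consumed_gb <= 0:
        return None  # no growth observed, so no estimate is possible
    growth_gb_per_hour = consumed_gb / hours_between
    return free_gb_now / growth_gb_per_hour


if __name__ == "__main__":
    # Hypothetical readings: 200 GB free, then 180 GB free 12 hours later.
    estimate = hours_until_full(200.0, 180.0, 12.0)
    print(f"Estimated hours until the datastore is full: {estimate:.0f}")
```

A single scheduled backup that flushes the logs can consume tens of GB at once, so an estimate taken between backups can be far too optimistic.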
0
 

Author Comment

by:cembi
ID: 38789556
Datastore space
LUN
No, for some reason the LUN is capped at 1.2TB.
Regarding Veeam, as I mentioned previously, it ends with a warning about snapshots. It seems to me the warning is about its inability to delete the snapshot. That is why I hesitate to run it again. Am I correct in these assumptions?

NT Backup runs inside VM and saves locally on the E drive. Backup Exec runs as an agent inside VM too just like it was running when this VM was physical. Do these contribute to snapshot space too?
0
 
LVL 118
ID: 38789569
As for your backups in the VM, yes, they do contribute to snapshot growth: they flush the Exchange logs, and any new writes go into the delta snapshot.
0
 
LVL 118
ID: 38789587
200gb is tight , for a week, but it depends on how busy your exchange server is.

yes, Veeam would complain about snapshot.

I would plan for emergency downtime from tomorrow evening or have no exchange server, and no mail, and having to complete a DR exercise with 2003 and users with no mail.

Sorry to be blunt, but I've seen things go bad with ALL email lost!
0
 

Author Comment

by:cembi
ID: 38789668
I hear you. Thanks again. I think that's what I'll do.
0
 
LVL 118
ID: 38789683
Let me know when you are starting, and I can be here to "hold your hand".
0
 

Author Comment

by:cembi
ID: 38790303
From bad to worse. I lost another 55GB of space due to an NTBackup running on schedule. I stopped it and also disabled Symantec Backup for this evening. Space in vSphere now shows as 112GB. Hopefully it will hold until tomorrow evening.
Is there a minimum space requirement for running consolidation of snapshots while powered ON? What would be the best backup approach before running consolidation since Veeam cannot be run? Downloading VM to another disc while it is powered down?
Thanks a ton.
0
 
LVL 118
ID: 38791993
Yes, I did state that using NTBackup and Symantec Backup Exec, would increase the snapshot size, when the logs get flushed!

see comment ID: 38789569

112GB is very low. If the VM is OFF, you can run Veeam, or download ALL the files, or copy them to another datastore.
0
 

Author Comment

by:cembi
ID: 38792760
- Would Veeam run faster with the VM OFF? It took around 9 hrs to back up with power ON.
- If I create another volume of 2TB on SAN I can move this VM on it and start running again?
- At this point with 111.03GB free space reported on ESX is it still an option running consolidation with power ON? My BIG concern is shutting VM down and then finding out that consolidation is taking more than 2 days and be forced to wait without knowing how much longer system will be down. CEO is on email 24 hrs a day.
- If consolidation is started online would space start releasing right away? Is there a risk that during the course of it space is consumed faster than it is released?

Thank you.
0
 

Author Comment

by:cembi
ID: 38793361
Another question: Even if I delete files in Exchange it won't help with space, right? So if I delete 10GB of data in Exchange, it won't release that space and instead it will decrease it by 10GB, is it how it works?
0
 
LVL 118
ID: 38793397
missed that other post.

How is the CEO going to react when he has no email to check, because it's ALL GONE!

The time it takes Veeam Backup to run, is because of the data being transferred from the server to backup location, it will not run much faster with VM off.

Yes, you can create a new datastore, and MOVE the existing VM, this will take time, and can be dangerous with snapshots attached! (there's a warning there!)

If you consolidate online, performance could get worse, additional space will be used, and the danger is you run out of disk space before the process completes, resulting in corrupted snapshots, corrupted VM, and no email server, as the VM will stop, when disk space is used up.

No, if you delete 10GB, the snapshot delta will grow by at least 10GB, because all those changes will be recorded!

(that's whats happening when NT Backup runs, it flushes the logs (deletes), and all those changes are recorded in the delta snapshot).

So the more writes you create, the more the snapshot will grow.

if you stop all access to the mail server, snapshot would stop growing as much, but even keeping a server up on a snapshot grows the snapshot by 100-200MB an hour, just doing nothing!
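The point that deleting data makes the snapshot bigger, not smaller, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, and real VMware delta disks are more complex than a dictionary of blocks. The idea it shows is the one described above: once a snapshot exists the parent disk is frozen, and every changed block, including blocks changed by a delete, is recorded in the delta.

```python
class SnapshotDisk:
    """Toy model of a disk running on a snapshot (hypothetical, for illustration)."""

    def __init__(self, base_blocks):
        self.base = dict(base_blocks)  # parent disk: frozen while the snapshot exists
        self.delta = {}                # every block changed since the snapshot

    def write(self, block, data):
        # All writes are redirected to the delta; the parent is never touched.
        self.delta[block] = data

    def delete(self, block):
        # A delete inside the guest is still a write as far as the disk is concerned.
        self.write(block, b"")

    def read(self, block):
        # The delta wins if the block has changed since the snapshot.
        return self.delta.get(block, self.base.get(block))

    def delta_blocks(self):
        return len(self.delta)


if __name__ == "__main__":
    disk = SnapshotDisk({0: b"mail-a", 1: b"mail-b", 2: b"mail-c"})
    disk.delete(1)                   # "free up space" inside the guest
    print(disk.delta_blocks())       # the delta has grown, not shrunk
    print(len(disk.base))            # the parent disk is unchanged in size
```

The parent disk only shrinks or changes when the delta is merged back into it, which is what deleting the snapshot does.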
0
 

Author Comment

by:cembi
ID: 38793438
What would be your guess with regards to running it while powered OFF? Could it last more than 2 days? Also, while it is consolidating could I start the VM at some point since some space would have been released already or is this not possible and once consolidation starts nothing should change?

I can't thank you enough for all this invaluable help.
0
 
LVL 118
ID: 38793526
Once you turn off the VM, and start the consolidation (merge), the VM will be running a task, and no other tasks can be performed, that includes Power On, you will not be able to power it on.

very difficult for me to predict, how long it would take it depends on speed of storage.

it's quite a small snapshot, compared to the most I've seen, if I was to guess, 3-4 hours (maybe!) do not quote me, but if I was scheduling outage, I would go for 24/48 hours.
0
 

Author Comment

by:cembi
ID: 38793545
Awesome. I was worried it may last for days. I will keep you posted. Thanks.
0
 

Author Comment

by:cembi
ID: 38794520
Hi hancocka.

I just realized that I can increase the size of the RAID volume by around 200GB. This was an earlier question posed by you. It doesn't hurt to increase it, correct? Is there a short guide on how to then increase the size of the datastore itself? What I know is that the VM needs to be powered down, then increase the RAID volume on SAN - go to vSphere Datastore Properties - Increase. What do you think?

I had a chat earlier with a VMWare tech and he said the 108GB should be enough to run snapshot deletion online.

Thanks again.
0
 
LVL 118
ID: 38794584
no it does not hurt to increase lun then datastore

select the datastore, properties, increase
0
 

Author Comment

by:cembi
ID: 38795112
I started the process with the machine ON. It ended after about 10 mins with an error:

Task: Remove all snapshots
Target: PROLIANT-NY
Status: Unable to access file <unspecified filename> since it is locked
Initiated by: root
Requested start time: 1/18/2013 2:17:19 PM
Start time: 1/18/2013 2:17:19 PM
Completed time: 1/18/2013 2:26:35 PM

What gives?!
0
 
LVL 118
ID: 38795177
does the datastore look any different, the files?

that's a very unusual error message.

what did you do to get that error message?

no other backup program, Veeam is not running?
0
 

Author Comment

by:cembi
ID: 38795249
I just followed your steps, except that the VM was ON. Yes, there is a new snapshot. Also, when I took the snapshot, Exchange was inaccessible from Outlook. Once the snapshot was created it came online. Then the same thing happened when I started the deletion process: Exchange went offline and then came back online when the process failed. What can I try?
0
 
LVL 118
ID: 38795254
okay, so VM was ON.

1. Select Snapshot, Take Snapshot
2. Check a new snapshot is created on the disk.
3. Check a new snapshot is created in the Snapshot Manager.
4. Click DELETE ALL this will Delete and Merge (Consolidate) ALL the Snapshots!


at which point did the error occur?

and is anything listed in Snapshot Manager?
0
 

Author Comment

by:cembi
ID: 38795272
All backups have been disabled. Could it be that Veeam is still accessing its own snapshots?
0
 

Author Comment

by:cembi
ID: 38795276
After step 4. Once I clicked Delete All, the task went to 95% and stayed there for about 10 mins, during which time Exchange went offline. Then it stopped with the error and Exchange came back up.
0
 
LVL 118
ID: 38795281
shutdown Veeam, if it's still running just in case.

okay potentially, that's quite worrying because it could be the snapshot chain has been corrupted already, which can happen, if you run out of disk space, which you already have once, when the VM first stopped.

The only thing, is to shutdown the VM, e.g. OFF.

try again. Steps 1 to 4.

Exchange going offline is caused by high CPU, and by the VM being frozen while the snapshot is merged.
0
 
LVL 118
ID: 38795283
it's actually getting late here in the UK, (GMT), so I'll hang around for a few more posts and responses.
0
 

Author Comment

by:cembi
ID: 38795295
Ok I will do that. Turn vm off and make sure Veeam is fully out. Exchange is not accessible anyway so it doesn't matter if vm is on or off.
0
 
LVL 118
ID: 38795302
i'll wait before going off to bed, to see if it does not stop after 10 mins, like before...
0
 

Author Comment

by:cembi
ID: 38795336
Man, things are getting strange. Just before doing what you recommended a VMware engineer calls me because he noticed I had called earlier. He tells me that no way I should shut down the VM as it seems to be an iffy situation and Exchange may not come back up. He tells me to use vConverter, install it on the VM and run it and convert it just like a P2V onto a new SAN volume (which I have handy by chance). He said this is by far the safest way of doing this.

I am perplexed. Your thoughts?!?!
0
 
LVL 118
ID: 38795364
Why did he think it was iffy? Unless they have WebExed in and taken a real-time look at the situation, which we do not have the option of doing remotely.

VMware always recommend the use of VMware Converter to get out of Snapshot situations! (because they do not want to spend time, supporting Customers!), and see it as a get out of jail issue. It's always the last resort. (which was my last option).

your decision.....

BUT (V2V can cause issues) - VMware probably didn't tell you also that P2V-ing an Exchange VM is not Supported by Microsoft, and can cause corruption. The official way, is to create a new Exchange Server, and Move the Mailboxes, and then remove the old Exchange Server.

See my EE Articles

HOW TO: Improve the transfer rate of a Physical to Virtual (P2V), Virtual to Virtual Conversion (V2V) using VMware vCenter Converter Standalone 5.0

HOW TO:  P2V, V2V for FREE - VMware vCenter Converter Standalone 5.0

So, options are yours......

Did they give you any instructions on how to V2V a Live Exchange 2003 Server  ?
0
 

Author Comment

by:cembi
ID: 38795374
The tech swore that V2V is by far the easiest and safest way, which strangely wasn't mentioned by the very first engineer who did 2 WebEx sessions with me earlier today. This first guy, besides suggesting the snapshot delete with the VM powered on, also mentioned using PuTTY and SSH to clone the VM onto a new LUN and in the process start with a brand new VM without the snapshots.

Have a great night and thanks a lot. I am not sure at the moment. I'll probably do something tomorrow. I am tired enough this very cold evening.
0
 
LVL 118
ID: 38795382
Yes, CLONE disk to second LUN was my second option.

But all depends on state of chain.

go with "VMware's V2V option." if you are most comfortable with their support of the Exchange 2003 VM.
0
 

Author Comment

by:cembi
ID: 38795527
Hey Andy. No, the VMware tech didn't take the time to explain the V2V process. He basically was in a rush to get home as his shift was over. My feeling is that there is something really wrong with these snapshots. The weird thing is that the last couple of times that this VM was restarted, the Exchange System Attendant and Store were not started automatically and I had to start them manually. I am going through your articles and will read some more.
So in your opinion it would be best to try deletion with VM off and if that doesn't work use cloning and then last option would be V2V?
Have a great weekend.
0
 

Author Comment

by:cembi
ID: 38795583
I found an article online where the user had same exact issue. It seems like Veeam keeps the connection open to the VM even when it is not running a backup hence the file locked error. I will disable all Veeam services and give it another go.
0
 
LVL 118

Accepted Solution

by:Andrew Hancock (VMware vExpert / EE MVE)
Andrew Hancock (VMware vExpert / EE MVE) earned 500 total points
ID: 38796199
I would complete the following:-

0. Backup existing VM and all files to new LUN

as follows:-

HOW TO: Clone or Copy a virtual machine in VMware vSphere Hypervisor ESX/ESXi 4.x or ESXi 5.0


1. Restart Host Server (this will clear all locks!).
2. Use procedure above whilst VM is OFF.
3. We can use CLONE option (I'll give you the commands)
4. Finally, as the last resort, V2V (VMware Converter)
0
 

Author Comment

by:cembi
ID: 38796309
So I shouldn't try Delete All when Veeam is all shutdown? Any harm in doing this?
0
 

Author Comment

by:cembi
ID: 38796380
Andy, is there any way of contacting you via email?
0
 
LVL 118
ID: 38796720
All communications at EE must be through the forum.

Terms of Use

Offline Answers
Experts Exchange is built on sharing information and solutions. The use of email or other communications systems that is outside of Experts Exchange's question and answer or articles system is prohibited, and no points will be awarded for any solution arrived at outside of those systems.

The Moderators and Topic Advisors will remove your email address if you post it; in addition to keeping you from the violation listed above, it will help keep you from getting too much unwanted email.


Source
http://www.experts-exchange.com/help/viewHelpPage.jsp?helpPageID=181

I think a host restart is needed, with Veeam shut down and not running. Is Veeam in a VM or on a physical server?
0
 

Author Comment

by:cembi
ID: 38796760
OK, here is an update:
After a 3hr phone conversation with a VMware engineer from Ireland -:) we finally got the process running. What happened is that when we tried deleting snapshots with the VM off, it did so in an instant and basically didn't delete anything. There was still a lock on one of the files. We had to reboot the ESX host and all locks were gone. Then he removed the VM from inventory, edited the .vmx file to point to last night's snapshot, added it back to inventory and did a power on test. The VM was fine and we were able to power it on without an issue. We powered down again. He removed the 8 and 9 snapshots (latest) and started the deletion again. Within 1 hr it went gradually from 5 to 99% and it has been at that for about 2 hrs. Fingers crossed it will end successfully sometime this weekend.
He stated that the chain of snapshots didn't seem to be corrupt.

Veeam is on a physical machine but all Veeam services on that machine have been stopped.

Thank you.
0
 
LVL 118
ID: 38796887
Hence why I asked you to reboot the host to clear the locks, I'm glad VMware agree with me!

Once this has been completed, you need to add daily snapshot checks to your VMware Admin routines!

So this engineer clearly didn't want to do a V2V using Converter, as per the previous VMware engineer!
0
 

Author Comment

by:cembi
ID: 38797000
Never doubted you man. You have been a great help and kept me focused. Actually if not for your advice my VM would have run out of space by now.
Is there a way to monitor the progress of the process while at 99%? The engineer put up a putty screen which shows all vmdk files in a list and told me to watch as they disappear while consolidating. No change after 4 hours though. I hope this is an accurate indicator.
0
 
LVL 118
ID: 38797011
there is not really much to monitor. you can login to server via SSH, and watch, but it does not show much.

Be Patient, go away, have a McDonald's, or a sandwich, or a cup of coffee; there is nothing worse than staring at a screen.
0
 
LVL 118
ID: 38797020
I'm actually also working on another EE related snapshot issue, they are very common!
0
 

Author Comment

by:cembi
ID: 38798526
The consolidation ended successfully in about 8 hours. VM powered on and email performance is much better.

- What should I do to avoid the snapshot issue in the future?
- What is the best way of backing up this Exchange VM? Right now I use symantec backup exec to backup information store to tape and snapshots on Equallogic and also Veeam which caused the issue. I am a bit disappointed in Veeam.
- Is there a good and simple guide to maintain an ESX infrastructure?

Perhaps these are issues for a new question so let me know if I need to do that.

Thank you Andy.
0
 
LVL 118

Assisted Solution

by:Andrew Hancock (VMware vExpert / EE MVE)
Andrew Hancock (VMware vExpert / EE MVE) earned 500 total points
ID: 38798538
The consolidation ended successfully in about 8 hours. VM powered on and email performance is much better.

Performance will be better, now it's not writing to a snapshot disk.

What should I do to avoid the snapshot issue in the future?

If you continue to use backup based products that use the VMware snapshot method, you cannot avoid it, but make daily regular checks on your VMs, or Set Alarms on vCenter Server to warn you if there is a left over snapshot, which should have been deleted as part of the backup process.

What is the best way of backing up this Exchange VM? Right now I use symantec backup exec to backup information store to tape and snapshots on Equallogic and also Veeam which caused the issue. I am a bit disappointed in Veeam.

Using SAN snapshots is good, because you can avoid using VMware Snapshots.

Veeam Backup and Replication is one of the best, but there are others to consider.

AppAssure
http://www.appassure.com/ - Number 1 Backup and VMs and Cloud

Unitrends
http://www.unitrends.com/ - a good vRecovery Backup Appliance.

Is there a good and simple guide to maintain an ESX infrastructure?

There are many books; read the documentation. Any further details really need new questions.
0
 

Author Comment

by:cembi
ID: 38798879
Thanks. Are the snapshots left behind because of failed backups or they will be there regardless and this consolidation process will be needed to run regularly ?
0
 
LVL 118
ID: 38798948
the backup would have been successful, but maybe flagged as failed in the backup logs.

VMs should not be left running on a snapshot after a backup, but this often occurs.

setup alerts, check daily
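One way a daily check could be scripted is sketched below. It assumes you already have a list of the file names in the VM's datastore folder (for example copied from the datastore browser or an SSH directory listing) and relies on the naming ESX uses for snapshot delta disks, `<name>-000001.vmdk` and `<name>-000001-delta.vmdk`. The function name and the sample file names are hypothetical, and a VM whose own name happens to end in six digits would produce a false positive.

```python
import re

# Snapshot delta disks are named <name>-NNNNNN.vmdk and <name>-NNNNNN-delta.vmdk,
# where NNNNNN is a six-digit sequence number.
SNAPSHOT_DISK_RE = re.compile(r"-\d{6}(-delta)?\.vmdk$")


def find_snapshot_disks(filenames):
    """Return the sorted file names that look like snapshot delta disks."""
    return sorted(name for name in filenames if SNAPSHOT_DISK_RE.search(name))


if __name__ == "__main__":
    # Hypothetical folder listing for a VM called "exchange".
    listing = [
        "exchange.vmx",
        "exchange.vmdk",
        "exchange-flat.vmdk",
        "exchange-000001.vmdk",
        "exchange-000001-delta.vmdk",
    ]
    leftovers = find_snapshot_disks(listing)
    if leftovers:
        print("WARNING: VM appears to be running on a snapshot:")
        for name in leftovers:
            print("  " + name)
    else:
        print("No snapshot disks found.")
```

An empty result after a backup window is the expected state; anything else means a snapshot was left behind and should be investigated before it grows.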
0
 

Author Closing Comment

by:cembi
ID: 38799440
Many thanks.
0
