CorvalIT

Consolidating VMware Snapshots Not Attached To VMDK

Good morning,
We had a very interesting issue that I'll try to outline in as much detail as possible:
Our datastore, which hosts a file server, kept running out of space.  We expanded the datastore a couple of times, and it kept filling up until we had no more space to allocate.  We didn't expand the guest (which has a 2TB volume), but the datastore continued to run out of space.  What I noticed was a couple of new hard drives (VMDKs) in the folder.  They weren't attached to anything and we didn't run snapshots on the volume, so we were perplexed.

Eventually, it was determined that our Unitrends backup system creates a snapshot to do its incremental backups, but the backup failed because there wasn't enough space on the volume, and the snapshots just sat there.  Before we knew what the disks were, we deleted one of them, unmounted the 2TB VMDK, and remounted it to get the server up and running.  Now there are a ton of 0KB files.  We understand that the one VMDK is gone and we could restore some of the data, but there is now a 400GB snapshot sitting there that I want to consolidate back into the 2TB volume.

Is this possible?  The 2TB volume doesn't seem to know that it's missing any data or that it depends on snapshots anymore.

I appreciate your help.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

It could be possible. Which version of ESXi are you running? I ask because the maximum virtual disk size is 2TB minus 512 bytes, including snapshots.
CorvalIT

ASKER

It is 5.5
Okay, you will need to edit the VMDK descriptor files and fix the CID mismatch in order to merge, which could leave you with a corrupted parent.
OK, I'm not trying to corrupt my data... so a real solution would be appreciated if there is one.
If you have a missing snapshot in the chain, any merge could result in a corrupt disk.

Is the VM currently running on a snapshot?
No, which is why there's missing data.  No snapshots were ever created in VMware.

These were backup snapshots, so they don't exist in the snapshot manager (and never did).  Unitrends creates a snapshot to back up the deltas and then deletes the snapshot.  In our case it didn't delete the snapshot, because the datastore ran out of space.  So all of our delta data is sitting in the snapshot of a failed backup.

I hope that makes sense.
Yes, ALL backup applications use VMware snapshots to back up VMs, and ALL backup apps can leave VMs running on snapshots.

Check your VMs daily, and set an alarm in vCenter.

You deleted a link in the chain and the data is gone.

A forced merge can be done, but the result could be a corrupt disk.
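
As a rough illustration of the daily check suggested above, here is a minimal Python sketch that flags files in a VM folder listing that look like snapshot disks. It assumes the usual ESXi naming, where a snapshot descriptor is `<disk>-000001.vmdk` and its data extent is `<disk>-000001-delta.vmdk`; the `fs01` file names are hypothetical, and how you obtain the listing (Datastore Browser, `ls` over SSH) is left open.

```python
import re

# Assumption: snapshot files carry a six-digit sequence number, e.g.
# fs01-000001.vmdk (descriptor) and fs01-000001-delta.vmdk (data extent).
SNAPSHOT_FILE = re.compile(r"^.+-\d{6}(-delta)?\.vmdk$")


def find_snapshot_files(filenames):
    """Return the names that look like snapshot descriptors or delta extents."""
    return sorted(name for name in filenames if SNAPSHOT_FILE.match(name))


if __name__ == "__main__":
    # Hypothetical folder listing for a VM called fs01.
    listing = [
        "fs01.vmx",
        "fs01.vmdk",
        "fs01-flat.vmdk",
        "fs01-000001.vmdk",
        "fs01-000001-delta.vmdk",
    ]
    for name in find_snapshot_files(listing):
        print("possible leftover snapshot file:", name)
```

A non-empty result does not prove a snapshot is orphaned, since a VM may legitimately be running on one; it only tells you which folders deserve a closer look.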
Is it possible to open the snapshot and get the data?
There were two VMDKs.  One was 146GB and the other 400GB.  I deleted the 146GB volume.
Can you upload a screenshot of the datastore?

I would post my EE article, but I'm on my mobile.
2014-07-02-10-44-27.pdf

Attached as requested
I actually read that yesterday, which helped me understand what the actual problem is.
No, it was -000002.
Yes, that is the 400Gb volume.
I would make a copy of the files and work on the copy to complete the merge, as the result may or may not be corrupted.

Have you powered up the VM on the original parent since?
The VM is running now on the original VMDK.  There are a ton of 0KB files, which I assume are on the snapshot.
You will be missing data whatever the outcome, because you deleted 146GB of changes.

Those are the changes written between the first and second snapshots.

You may get some back from the 400GB snapshot if it merges correctly without disk corruption.
Ok, I'll try the article you mentioned.  

Is there a way to copy the 2TB vmdk to another datastore?  It would be quicker than a backup I would think.
You can either use copy/paste in the Datastore Browser or cp (at the console, or remotely over SSH).

Just for background understanding....

A virtual machine's working disk looks like:

Working.VMDK =  (parent vmdk + child snapshot-00001.vmdk +  child snapshot-00002.vmdk)

Changes made after the first snapshot is created are written to:

snapshot-00001.vmdk

Changes made after the second snapshot is created are written to:

snapshot-00002.vmdk

The snapshots contain only block changes (deltas); merged with the parent, they make up the complete working VMDK.

If a file in the chain is lost or corrupted, your working VMDK will have missing data.

It's possible to merge the 400GB back into the parent, BUT the results may not be correct.
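
To make the chain idea above concrete, here is a minimal Python sketch that checks whether each snapshot descriptor points at the disk before it. It assumes the text descriptor format in which a disk has a `CID` field, a snapshot has a `parentCID` field that must equal its parent's `CID`, and a base disk has `parentCID=ffffffff`. The CID values in the example are made up, and `check_chain` is a name invented for this sketch, not a VMware tool.

```python
NO_PARENT = "ffffffff"  # parentCID value carried by a base (parent) disk


def parse_descriptor(text):
    """Collect the key=value fields from VMDK descriptor text."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"')
    return fields


def check_chain(descriptors):
    """Check a chain of descriptor texts ordered base disk first.

    Returns a list of problem descriptions; an empty list means every
    child's parentCID matches the CID of the disk before it.
    """
    parsed = [parse_descriptor(text) for text in descriptors]
    problems = []
    if parsed and parsed[0].get("parentCID", "").lower() != NO_PARENT:
        problems.append("base disk has a parentCID other than ffffffff")
    for index, (parent, child) in enumerate(zip(parsed, parsed[1:]), start=1):
        if child.get("parentCID", "").lower() != parent.get("CID", "").lower():
            problems.append(
                f"link {index}: parentCID {child.get('parentCID')} "
                f"does not match parent CID {parent.get('CID')}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical descriptors: the second snapshot still points at a
    # deleted first snapshot, so its parentCID does not match the base.
    base = 'CID=aaaa1111\nparentCID=ffffffff\ncreateType="vmfs"\n'
    snap = 'CID=cccc3333\nparentCID=bbbb2222\nparentFileNameHint="fs01-000001.vmdk"\n'
    for problem in check_chain([base, snap]):
        print(problem)
```

A matching CID only says the links are consistent on paper. It says nothing about whether the blocks held by a deleted link can be recovered.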
You can use the vSphere Client, or use cp /vmfs/volumes/datastore name/vm folder name/filename /
OK, so I made a copy and changed the CID in the 400GB volume to match the 2TB volume's parent CID.  I removed it from inventory and re-added it, and it powered up.  It doesn't appear as if anything changed, though.  Can I consolidate it into a single volume now?
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
OK, so here's what I did. It was as successful as it could be, given my ignorance.

1. I cloned the server to a new volume.
2. I went through and changed the CIDs and parentname per this article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007969
3. I removed the clone from inventory and re-added it, verifying it was pointed to the snapshot disk.
4. I powered up the VM and saw the data was there, along with some corrupted data.  I restarted and ran CheckDisk, which cleared the bad data out.
5. I copied the data from the clone server (renamed, re-IP'd, etc.) to the existing file server and everyone's data was back.
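
For anyone repeating step 2, the descriptor edit can be sketched in Python as below. This is only an illustration of the idea, not the procedure from the KB article: it assumes the snapshot descriptor is a small text file containing `parentCID` and `parentFileNameHint` lines, the function name and the example values are hypothetical, and it should only ever be pointed at a copy of the descriptor.

```python
import re


def relink_descriptor(child_text, parent_cid, parent_file):
    """Return snapshot descriptor text re-pointed at the given parent.

    Only the parentCID and parentFileNameHint lines are rewritten;
    every other line, including the snapshot's own CID, is kept as-is.
    """
    out = []
    for line in child_text.splitlines():
        if re.match(r"^\s*parentCID\s*=", line):
            line = f"parentCID={parent_cid}"
        elif re.match(r"^\s*parentFileNameHint\s*=", line):
            line = f'parentFileNameHint="{parent_file}"'
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical snapshot descriptor that still references a deleted parent.
    snapshot = (
        "# Disk DescriptorFile\n"
        "version=1\n"
        "CID=cccc3333\n"
        "parentCID=bbbb2222\n"
        'parentFileNameHint="fs01-000001.vmdk"\n'
    )
    print(relink_descriptor(snapshot, "aaaa1111", "fs01.vmdk"))
```

The function works on strings instead of files on purpose, so the result can be reviewed before anything is written back to the cloned VM's folder.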

Thanks a ton Andrew!
Andrew nailed this out of the park and frankly I wouldn't have been able to do this without his help.  He saved my butt!