CorvalIT
asked on
Consolidating VMware Snapshots Not Attached To VMDK
Good morning,
We had a very interesting issue that I'll try to outline in as much detail as possible:
Our datastore, which hosts a file server, kept running out of space. We expanded the datastore a couple of times, and it kept filling up until we had no more space to allocate. We didn't expand the guest (which has a 2TB volume), but the datastore still ran out of space. What I noticed was a couple of new hard drives (VMDKs) in the folder. They weren't attached to anything and we didn't run snapshots on the volume, so we were perplexed.
Eventually, it was determined that our Unitrends backup system creates a snapshot to do its incremental backups, but it failed because there wasn't enough space on the volume, and the snapshots just sat there. Before we knew what the disks were, we deleted one of them, then unmounted the 2TB VMDK and remounted it to get the server up and running. Now there are a ton of 0KB files. We understand that the one VMDK is gone and we could restore some of the data, but there is now a 400GB snapshot sitting there that I want to consolidate back into the 2TB volume.
Is this possible? The 2TB volume doesn't seem to know that it's missing any data or that it depends on snapshots anymore.
I appreciate your help.
It could be possible. Which version of ESXi are you running? Before ESXi 5.5, the maximum virtual disk size is 2TB minus 512 bytes, including snapshots.
With ESXi 5.5, the maximum is 62TB.
ASKER
It is 5.5
Okay, you will need to edit the VMDK descriptor files and fix the CID mismatch before you can merge, which could end up corrupting the parent.
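To illustrate what a CID mismatch looks like, here is a minimal Python sketch that reads the `CID` and `parentCID` fields from VMDK descriptor text and reports any child whose `parentCID` does not match its parent's `CID`. The descriptor contents, file names, and function names are made up for illustration; real descriptors contain more fields, and the VMware KB article linked later in the thread remains the procedure to follow.

```python
import re

def parse_descriptor(text):
    """Extract CID, parentCID and parentFileNameHint from descriptor text."""
    fields = {}
    for key in ("CID", "parentCID", "parentFileNameHint"):
        # ^ with MULTILINE anchors at line start, so "CID" does not match "parentCID".
        match = re.search(r'^%s\s*=\s*"?([^"\r\n]*)"?' % key, text, re.MULTILINE)
        fields[key] = match.group(1).strip() if match else None
    return fields

def find_cid_mismatches(chain):
    """chain: list of (name, descriptor_text), parent first, newest child last.

    Returns a list of (child_name, parent_cid, child_parent_cid) for every
    link where the child's parentCID differs from the parent's CID.
    """
    problems = []
    for (_parent_name, parent_text), (child_name, child_text) in zip(chain, chain[1:]):
        parent = parse_descriptor(parent_text)
        child = parse_descriptor(child_text)
        if child["parentCID"] != parent["CID"]:
            problems.append((child_name, parent["CID"], child["parentCID"]))
    return problems

# Hypothetical descriptors: the parent's CID changed after the VM ran on it,
# so the snapshot still points at the old value.
PARENT = 'version=1\nCID=a1b2c3d4\nparentCID=ffffffff\ncreateType="vmfs"\n'
CHILD = ('version=1\nCID=11112222\nparentCID=deadbeef\n'
         'createType="vmfsSparse"\nparentFileNameHint="fileserver.vmdk"\n')

mismatches = find_cid_mismatches([("fileserver.vmdk", PARENT),
                                  ("fileserver-000001.vmdk", CHILD)])
```

An empty result would mean every link in the chain is consistent. A non-empty result only identifies where the chain is broken; it says nothing about whether the data in the delta is still valid against the parent.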
ASKER
OK, I'm not trying to corrupt my data... so a real solution would be appreciated if there is one.
If you have a missing snapshot in the chain, any merge could result in a corrupt disk.
Is the VM currently running on a snapshot?
ASKER
No, which is why there's missing data. No snapshots were ever created in VMware.
These were backup snapshots, so they don't exist in the snapshot manager (and never did). Unitrends creates a snapshot to back up the deltas and then deletes the snapshot. In our case it didn't delete the snapshot, because the datastore ran out of space. So all of our delta data is sitting in the snapshot of a failed backup.
I hope that makes sense.
Yes, ALL backup applications use VMware snapshots to back up VMs, and ALL backup apps can leave VMs running on snapshots.
Check your VMs daily, and set an alarm in vCenter.
You deleted a link in the chain, and that data is gone.
A forced merge can be done, but the result could be a corrupt disk.
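As a rough illustration of the daily check, the Python sketch below walks a directory tree and lists files that look like snapshot disks. It assumes the usual VMware naming pattern for snapshot files (`name-000001.vmdk`, `name-000001-delta.vmdk`, `name-000001-sesparse.vmdk`); the function name and the idea of scanning a mounted copy of the datastore are assumptions for illustration, and this is not a substitute for a vCenter alarm.

```python
import os
import re

# Assumed naming convention for snapshot descriptors and their data files.
SNAPSHOT_FILE = re.compile(r"-\d{6}(-delta|-sesparse)?\.vmdk$")

def find_snapshot_files(root):
    """Return a sorted list of (path, size_in_bytes) for snapshot-like files under root."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if SNAPSHOT_FILE.search(name):
                path = os.path.join(folder, name)
                hits.append((path, os.path.getsize(path)))
    return sorted(hits)
```

Calling `find_snapshot_files` on a VM folder that the snapshot manager reports as clean and getting a non-empty list back is the situation described in this thread: leftover delta files from a backup job that never cleaned up.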
ASKER
Is it possible to open the snapshot and get the data?
Which file did you delete?
No, you cannot open the file.
ASKER
There were two VMDKs. One was 146GB and the other 400GB. I deleted the 146GB volume.
Can you upload a screenshot of the datastore?
I would post my EE article, but I'm on my mobile.
Do you remember if it was -000001?
ASKER
I actually read that yesterday, which helped me understand what the actual problem is.
ASKER
No, it was -000002.
Do you still have -000001?
ASKER
Yes, that is the 400GB volume.
I would make a copy of the files and work on the copy to complete the merge, as the result may or may not be corrupted.
Have you powered up the VM on the original parent since?
You will then need to follow this VMware article...
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007969
ASKER
The VM is running now on the original VMDK. There are a ton of 0KB files, which I assume are on the snapshot.
You will be missing data whatever the outcome, because you deleted 146GB of changes, which fall between the first and second snapshots.
You may get some back from the 400GB snapshot if it merges correctly without disk corruption.
ASKER
Ok, I'll try the article you mentioned.
Is there a way to copy the 2TB vmdk to another datastore? It would be quicker than a backup I would think.
You can either use copy/paste in the Datastore Browser or cp (at the console or remotely over SSH).
Just for background understanding...
A virtual machine's working disk looks like:
Working.VMDK = (parent vmdk + child snapshot-00001.vmdk + child snapshot-00002.vmdk)
Changes made after the first snapshot is created are written to:
snapshot-00001.vmdk
Changes made after the second snapshot is created are written to:
snapshot-00002.vmdk
The snapshots contain only changed blocks (deltas); combined with the parent, they make up the complete working VMDK.
If a file is lost or corrupted, your working VMDK will have missing data.
It's possible to merge the 400GB snapshot back into the parent, BUT the results may not be correct.
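The chain described above can be modelled in a few lines of Python. This is a toy sketch only: each disk is a dictionary mapping a block number to its contents, which is an assumption made for illustration and not how VMFS stores delta files. It shows why losing one link leaves stale or missing data in the working disk.

```python
def read_working_disk(parent, snapshots):
    """Overlay snapshot deltas on the parent, oldest first; newer blocks win."""
    working = dict(parent)
    for delta in snapshots:
        working.update(delta)
    return working

# Hypothetical three-block disk.
parent = {0: "A0", 1: "B0", 2: "C0"}
snap1 = {1: "B1"}  # blocks changed after the first snapshot was created
snap2 = {2: "C2"}  # blocks changed after the second snapshot was created

full = read_working_disk(parent, [snap1, snap2])
# With the second delta deleted, block 2 silently falls back to the parent's old contents.
broken = read_working_disk(parent, [snap1])
```

In the toy model the damage is limited to the blocks that only existed in the deleted delta. On a real file system those blocks can hold metadata as well as file contents, which is why a forced merge may produce a disk that needs repair.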
You can use the vSphere Client, or use cp /vmfs/volumes/datastore name/vm folder name/filename /
ASKER
OK, so I made a copy and changed the CID in the 400GB volume to match the 2TB volume's parent CID. I removed it from inventory and re-added it, and it powered up. It doesn't appear as if anything changed, though. Can I consolidate it into a single volume now?
ASKER
OK, so here's what I did. It was as successful as it could be, given my ignorance.
1. I cloned the server to a new volume.
2. I went through and changed the CIDs and the parent file name per this article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007969
3. I removed the clone from inventory and re-added it, verifying it was pointed to the snapshot disk.
4. I powered up the VM and saw the data was there, along with some corrupted data. I restarted and ran CheckDisk, which cleared the bad data out.
5. I copied the data from the clone server (renamed, re-IP'd, etc.) to the existing file server and everyone's data was back.
Thanks a ton Andrew!
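For readers following the same route, step 2 amounts to rewriting two fields in the snapshot's descriptor file. The Python sketch below shows that edit on descriptor text held in memory; the function name and sample values are hypothetical, it assumes both fields are already present in the descriptor, and it should only ever be applied to a copy, as was done here.

```python
import re

def set_parent_link(child_text, parent_cid, parent_file):
    """Return child descriptor text with parentCID and parentFileNameHint rewritten."""
    # Lambdas are used as replacements so the values are inserted literally.
    text = re.sub(r"^parentCID=.*$",
                  lambda m: "parentCID=" + parent_cid,
                  child_text, flags=re.MULTILINE)
    text = re.sub(r"^parentFileNameHint=.*$",
                  lambda m: 'parentFileNameHint="%s"' % parent_file,
                  text, flags=re.MULTILINE)
    return text

# Hypothetical snapshot descriptor that still points at a stale parent CID.
CHILD_DESCRIPTOR = ('version=1\nCID=11112222\nparentCID=deadbeef\n'
                    'createType="vmfsSparse"\nparentFileNameHint="old-name.vmdk"\n')

fixed = set_parent_link(CHILD_DESCRIPTOR, "a1b2c3d4", "fileserver.vmdk")
```

Only the link to the parent changes; the snapshot's own `CID` line is left alone. Making the chain consistent does not guarantee the merged data is consistent, which matches the corrupted files that CheckDisk had to clean up in step 4.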
ASKER
Andrew nailed this out of the park and frankly I wouldn't have been able to do this without his help. He saved my butt!
Many Thanks for the Kind Words!