options to use when Disk Consolidation does not work

I have been reading about options that might resolve disk consolidation issues. Some of them are:


1 - Create a new snapshot and remove it, this might complete the consolidation

2 - vMotion the VM to another host and try to consolidate again

3 - Restart the management agents on the host where the VM is running and try to consolidate again

4 - Restart the Backup Services (one of the backup  services might be locking the snapshot files, preventing them from consolidating)

5 - Shut down the VM and try to consolidate again


Regarding option 3, I would like to know what the impact is when you restart the management agents on the host. Will this migrate the VMs to another host with no issue?

If you know other options that can fix a disk consolidation issue, please share them.

Thank you
jskfanAsked:
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
options to use when Disk Consolidation does not work

it's stupid, and VMware's method for the novice!!!! Administrators need to understand what is really happening, rather than just hitting the "Disk Consolidate" button!

Regarding option 3, I would like to know what the impact is when you restart the management agents on the host. Will this migrate the VMs to another host with no issue?

It just stops the agents, so vCenter Server cannot manage or communicate with the host while they restart. It may stop running tasks.

It will NOT migrate the VMs to other hosts, and it will not affect running VMs.

Read my EE Article and become familiar with evil snapshots...

HOW TO: VMware Snapshots :- Be Patient

You need to work the problem, and those options are a start.....there is no single solution to the "Snapshot issues" other than DO NOT USE THEM, or any backup software which also uses them!

We would never use option 2, or use the consolidate option.
Mr TorturSystem EngineerCommented:
Hi,

I have often had cases where this option did not work.
It does seem to be a rather dumb option, as Andrew commented.

In my cases, there were issues in the snapshot chain.
AFAIK, for each VMDK, if you use the CLI you'll see you have 2 files:
vm_name.vmdk
vm_name-flat.vmdk
and with a snapshot on this VM you'll also have:
vm_name-000001.vmdk
vm_name-000001-delta.vmdk


In each pair, the first file is a descriptor, a small text file.
The second one actually contains your VM data.

First, be careful what you do in the CLI with these files!
If you open a descriptor you'll see an ID (CID) and a parent ID (parentCID).
The CID is a hexadecimal value (0-9, a-f); this is the ID of the VMDK.
The parentCID is the CID of the parent VMDK in the snapshot chain.

In my example above, vm_name.vmdk is the base disk and has no parent, so its parentCID is the placeholder ffffffff.
The parentCID of the snapshot descriptor will be the CID of vm_name.vmdk.


Check these values for a mismatch between the descriptors.
It should be:
VMDK1         parentCID=ffffffff (base disk, no parent)
snap1           parentCID=VMDK1 CID
snap2           parentCID=snap1 CID
etc...

Sometimes these IDs get corrupted: the same CID appears on the VMDK and a snapshot, or a snapshot descriptor has a bad parentCID, etc.
You can edit the descriptors to correct them, then retry the consolidate option in vCenter.
I did this many times.
Your own issue could also be elsewhere.
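
The descriptor check described above can be sketched in a few lines of Python. This is only an illustration, not a VMware tool: it assumes you have copied the descriptor text off the datastore, that each descriptor contains plain `CID=` and `parentCID=` lines with 8 hex digits, and that you pass the descriptors in chain order (base disk first). The function names and the labels VMDK1, snap1, snap2 are made up for the example.

```python
import re

# Matches descriptor lines such as "CID=1a2b3c4d" or "parentCID=1a2b3c4d".
_FIELD = re.compile(r"^(CID|parentCID)=([0-9a-fA-F]{8})\s*$", re.MULTILINE)


def read_ids(descriptor_text):
    """Return (cid, parent_cid) found in a descriptor; None if a line is missing."""
    fields = {name: value.lower() for name, value in _FIELD.findall(descriptor_text)}
    return fields.get("CID"), fields.get("parentCID")


def check_chain(descriptors):
    """Check (label, descriptor_text) pairs, ordered base disk first.

    Reports duplicate CIDs and snapshots whose parentCID does not match
    the CID of the previous disk in the list. The base disk's own
    parentCID is not checked here.
    """
    problems = []
    seen = {}  # CID -> label of the first descriptor that uses it
    prev_label, prev_cid = None, None
    for label, text in descriptors:
        cid, parent = read_ids(text)
        if cid is None or parent is None:
            problems.append(f"{label}: CID or parentCID line not found")
        else:
            if cid in seen:
                problems.append(f"{label}: CID {cid} is also used by {seen[cid]}")
            else:
                seen[cid] = label
            if prev_cid is not None and parent != prev_cid:
                problems.append(
                    f"{label}: parentCID {parent} does not match "
                    f"CID {prev_cid} of {prev_label}"
                )
        prev_label, prev_cid = label, cid
    return problems


if __name__ == "__main__":
    chain = [
        ("VMDK1", "CID=aaaa1111\nparentCID=ffffffff\n"),
        ("snap1", "CID=bbbb2222\nparentCID=aaaa1111\n"),
        ("snap2", "CID=cccc3333\nparentCID=deadbeef\n"),  # broken link
    ]
    for problem in check_chain(chain):
        print(problem)
```

It only reports what looks wrong; any actual correction of a descriptor is still a manual edit, with the care mentioned above.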
jskfanAuthor Commented:
Andrew,

Which of the links points to resolving the consolidation issue? I see several links here:

HOW TO: VMware Snapshots :- Be Patient
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Read the article and understand what snapshots do, and how to check for their existence. The article shows one method of dealing with them!

There are 101 reasons WHY a snapshot exists on a VM, and how you deal with it depends on many factors.

That will come with experience as you deal with snapshots more, but understanding what they do and how they are created is a start.

A book could be written on snapshots alone!
jskfanAuthor Commented:
Sorry, I went through the link you posted. It explains what a snapshot is and things to consider about snapshots.
However, I have not seen quick workarounds to get rid of it.

From other articles:
Creating a new snapshot and then deleting it while the VM is powered off might speed up the deletion.
You can clone the VM while it is powered off.

**If the VM is powered off, it means downtime, so it is not a good solution.

Stop the backup services if the snapshot was created by the backup software.
A host vMotion might help.

***I am not sure how a host vMotion can be helpful, since the file is still in the same datastore.

Storage vMotion might help.
***Not sure if this will help either.

Has any expert gone through troubleshooting disk consolidation?
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Sorry, I went through the link you posted. It explains what a snapshot is and things to consider about snapshots.
However, I have not seen quick workarounds to get rid of it.

The article shows one method of dealing with a snapshot. It does not cover the hundreds of different scenarios which can cause a snapshot, or how to deal with each of them.

***I am not sure how Host Vmotion can be Helpful, since the file is still in the same datastore.

it will not.

Storage vmotion might help

it will not.

I believe the following questions have been answered.


1 - Create a new snapshot and remove it, this might complete the consolidation

2 - vMotion the VM to another host and try to consolidate again

3 - Restart the management agents on the host where the VM is running and try to consolidate again

4 - Restart the Backup Services (one of the backup  services might be locking the snapshot files, preventing them from consolidating)

5 - Shut down the VM and try to consolidate again

Are you asking for an answer which covers all the different scenarios of how to cope with a snapshot if you find one?

And a flow chart of what to do, with yes and no decision trees?

Do you understand the conditions under which a snapshot is left behind?

Do you understand what a snapshot is?

If you've grasped this basic knowledge on snapshots, then we can proceed with trying to teach you how to analyse and diagnose the issue, which can lead to a cure, with VMware vSphere (ESXi) skills you may already have. Trying to provide an answer to the many different scenarios in which a snapshot may exist, and how to deal with each, is going to be long, if not impossible.
jskfanAuthor Commented:
Flowchart  would be good.
jskfanAuthor Commented:
Andrew,
Where is the troubleshooting Flowchart  for  snapshots ?
nociSoftware EngineerCommented:
@jskfan,

I will try to explain things a little. If you need real advice, Andrew has way more experience with this.
I think Andrew is right to point to that article.

A flow chart would be near impossible to create (you would need one for every possible situation, covering how you got there and how to get out).

Short: to get rid of old snapshots you need to get rid of ALL of them. And it might take a while to get that done.
Anything else you try will not reduce the amount of data or storage needed, and it may explode in the meantime.

Making a snapshot effectively freezes the original disk and then creates a journal of changes since that moment.
Making a new snapshot does the same, etc.
Deleting a snapshot causes its journal to be applied to the original disk, or to the previous journal.
This may consume a huge amount of intermediate disk space due to the journals.
Until a journal is completely written to the previous image it cannot be removed. Any updates to the disk from the VM will still be appended to the journals
(otherwise a replay from the journal may overwrite newly written data, or inconsistent results may be produced).
So it is probably best to delete the oldest journal first, so a newer journal will not get applied to older journals first.

Therefore handling the journals will take time, even more so when the system is still running.
Also, reading from a disk means not only reading from the original disk but also searching through all the "snapshot" journals for any update to that data.
So try to follow Andrew's advice on this. And read up on snapshots as well as anything linked to the subject.
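
To make the journal idea concrete, here is a toy model in Python. It is a sketch of how snapshot journals work in general, not of how VMware implements them: the class `DiskChain`, its method names, and the block numbers are all invented for the illustration.

```python
class DiskChain:
    """Toy model: a frozen base disk plus change journals, oldest first."""

    def __init__(self, base):
        self.base = dict(base)  # block number -> data
        self.journals = []      # each journal maps block number -> data

    def take_snapshot(self):
        # Freeze everything written so far; new writes go to a fresh journal.
        self.journals.append({})

    def write(self, block, data):
        # Writes always land in the newest journal, or in the base if none exists.
        target = self.journals[-1] if self.journals else self.base
        target[block] = data

    def read(self, block):
        # Search the newest journal first, then older ones, then the base disk.
        for journal in reversed(self.journals):
            if block in journal:
                return journal[block]
        return self.base.get(block)

    def delete_oldest_snapshot(self):
        # Apply the oldest journal to the base disk, then drop the journal.
        oldest = self.journals.pop(0)
        self.base.update(oldest)


if __name__ == "__main__":
    disk = DiskChain({0: "A", 1: "B"})
    disk.take_snapshot()
    disk.write(1, "B2")
    disk.take_snapshot()
    disk.write(0, "A3")
    print(disk.read(0), disk.read(1))  # the guest sees the latest data
    disk.delete_oldest_snapshot()
    print(disk.read(0), disk.read(1))  # unchanged after the merge
    print(disk.base, disk.journals)
```

The point of the model: every read may have to walk all journals, and removing a journal means copying its blocks into the layer below, which is why long chains are slow to read and slow to clean up.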

Think of snapshots as a double-edged sword where the hilt is also a blade. In short, handle with care if you know how to, and if you only have half a clue about it, first try it out in a toy environment.
jskfanAuthor Commented:
Actually Andrew mentioned the flowchart, which is why I asked for it. My initial question was about workarounds that can resolve snapshot issues when snapshots refuse to be deleted. I found the options below online, but there might be other options:


1 - Create a new snapshot and remove it, this might complete the consolidation

2 - vMotion the VM to another host and try to consolidate again

3 - Restart the management agents on the host where the VM is running and try to consolidate again

4 - Restart the Backup Services (one of the backup  services might be locking the snapshot files, preventing them from consolidating)

5 - Shut down the VM and try to consolidate again
nociSoftware EngineerCommented:
Snapshot handling is on the host side of the VM host/guest bridge (the guest has no knowledge of it).
So I fail to see how restarting a management agent within a guest could work.
vMotion moves a VM to another host; that should NOT interfere with or influence storage settings. So how can that help?
Restarting backup services: well, IF backup is done through snapshots then yes, that may have an influence. Q1 in that case: do you have hanging backups? If backups are finished then there should be no locks. If it still locks stuff, you need different backup software (reliability).
When a VM is shut down, consolidation should be speedier as there is no constant I/O load from the VM. So that can actually help, but restarting the VM will cause that load again. Then again, the running of the guest should not interfere with snapshot management, as the guest should have no knowledge of the snapshot existing.
Assuming that creating a new snapshot and then deleting it again will start a consolidation process seems naive to me. If it does, then the snapshot software is rotten and should not be used at all (lack of reliability), or it should not be used because it does more than explicitly requested
(resolving 2 consolidations with one command).

If consolidation fails you are maybe starting from the wrong end. Try working from the oldest to the newest (that causes less work at every step, as all I/O can be done to the final image). If you remove an intermediate one, then what can be consolidated? You will still be stuck with an old one. Note I am not a VMware user (except for VMware Workstation), but this is how snapshots in general work.