mirde asked:

VMWare ESX 4 and using VCB Backup Snapshot remove failed

Hello,

I am trying to configure VMware VCB backup in our environment, but I keep hitting a dead end when trying to get a successful backup using the VCB Framework.

VMWare ESX v.4.0.0 (B: 258672)
VMWare VCenter Server v4.0.0 (B: 258672)

Backup server is using VMWare Consolidated Backup Framework v1.5.0 (B: 226297), Windows 2008R2 x64 SP1 Enterprise OS.

The backup VM is on the same virtual infrastructure as the VM being backed up (the VM we are trying to back up is a fresh Windows 2008 R2 x64 SP1 install with VMware Tools installed).

Before I start using our backup software (CA ArcServe r15), I am testing this at the VCB level to get a successful backup. However, it is failing; here is the log:

C:\Program Files (x86)\VMware\VMware Consolidated Backup Framework>vcbMounter.exe -h 172.16.4.17:9443 -u "vand1\administrator" -p J@ng0F3tt -a ipaddr:172.16.5.101 -r C:\EDI03 -t fullvm -m nbd

[2011-05-10 12:44:17.957 'App' 4072 info] Current working directory: C:\Program Files (x86)\VMware\VMware Consolidated Backup Framework
[2011-05-10 12:44:17.959 'BaseLibs' 4072 info] HOSTINFO: Seeing Intel CPU, numCoresPerCPU 1 numThreadsPerCore 1.
[2011-05-10 12:44:17.960 'BaseLibs' 4072 info] HOSTINFO: This machine has 2 physical CPUS, 2 total cores, and 2 logical CPUs.
[2011-05-10 12:44:18.498 'BaseLibs' 4072 info] Using system libcrypto, version 90709F
[2011-05-10 12:44:20.140 'BaseLibs' 4072 warning] SSLVerifyCertAgainstSystemStore: Subject mismatch: van-monitor.vand1.oppy.com vs 172.16.4.17
[2011-05-10 12:44:20.141 'BaseLibs' 4072 warning] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* The host name used for the connection does not match the subject name on the host certificate

* The host certificate chain is not complete.

[2011-05-10 12:44:20.143 'BaseLibs' 4072 warning] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
Copying "[Fibre4] Van-EDI03/Van-EDI03.vmx":
        0%=====================50%=====================100%
        **************************************************

Copying "[Fibre4] Van-EDI03/Van-EDI03-aux.xml":
        0%=====================50%=====================100%
        **************************************************

Copying "[Fibre4] Van-EDI03/Van-EDI03.nvram":
        0%=====================50%=====================100%
        **************************************************

Copying "[Fibre4] Van-EDI03//vmware-1.log":
        0%=====================50%=====================100%
        *************************************************

Copying "[Fibre4] Van-EDI03//vmware-2.log":
        0%=====================50%=====================100%
        *************************************************

Copying "[Fibre4] Van-EDI03//vmware-3.log":
        0%=====================50%=====================100%
        **************************************************

Copying "[Fibre4] Van-EDI03//vmware-4.log":
        0%=====================50%=====================100%
        *************************************************

Copying "[Fibre4] Van-EDI03//vmware.log":
        0%=====================50%=====================100%
        **************************************************

Converting "C:\EDI03\scsi0-0-0-Van-EDI03.vmdk" (compact file):
        0%=====================50%=====================100%
        **************************************************

Converting "C:\EDI03\scsi0-1-0-Van-EDI03_1.vmdk" (compact file):
        0%=====================50%=====================100%
        **************************************************

[2011-05-10 13:30:52.567 'vcbMounter' 4072 warning] Snapshot deletion failed. Attempting to clean up snapshot database...
[2011-05-10 13:31:03.011 'BaseLibs' 2980 warning] SSLVerifyCertAgainstSystemStore: Subject mismatch: van-monitor.vand1.oppy.com vs 172.16.4.17
[2011-05-10 13:31:03.011 'BaseLibs' 2980 warning] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* The host name used for the connection does not match the subject name on the host certificate

* The host certificate chain is not complete.

[2011-05-10 13:31:03.011 'BaseLibs' 2980 warning] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
[2011-05-10 13:31:25.479 'vcbMounter' 4072 error] Error: Other error encountered: Snapshot remove failed: Unable to access file <unspecified filename> since it is locked
[2011-05-10 13:31:25.479 'vcbMounter' 4072 error] An error occurred, cleaning up...
[2011-05-10 13:31:25.963 'vcbMounter' 4072 warning] Snapshot deletion failed. Attempting to clean up snapshot database...

Deleted directory C:\EDI03

C:\Program Files (x86)\VMware\VMware Consolidated Backup Framework>
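
For anyone triaging similar output, the warning and error entries can be pulled out of a saved copy of the vcbMounter output with a short script. This is only a sketch in Python: the bracketed line layout is inferred from the log above, and the names LogEntry, parse_entries and problems are made up for illustration. The number after the component name is kept as an opaque ID, because the log does not say whether it is a process or a thread ID.

```python
import re
from typing import Iterable, List, NamedTuple

# Layout inferred from the log above, for example:
# [2011-05-10 13:31:25.479 'vcbMounter' 4072 error] Error: ...
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"'(?P<component>[^']+)' (?P<ident>\d+) (?P<level>\w+)\] (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    component: str
    ident: int      # numeric ID from the log; meaning not stated in the source
    level: str
    message: str


def parse_entries(lines: Iterable[str]) -> List[LogEntry]:
    """Return one LogEntry per bracketed log line; other lines are skipped."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(
                LogEntry(
                    match["timestamp"],
                    match["component"],
                    int(match["ident"]),
                    match["level"],
                    match["message"],
                )
            )
    return entries


def problems(entries: Iterable[LogEntry], component: str = "vcbMounter") -> List[LogEntry]:
    """Keep only warning and error entries logged by the given component."""
    return [
        entry
        for entry in entries
        if entry.component == component and entry.level in ("warning", "error")
    ]
```

Filtering on the vcbMounter component skips the repeated SSL certificate warnings from BaseLibs, which leaves the two "Snapshot deletion failed" warnings and the "Snapshot remove failed ... locked" error.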



I think that for vcbMounter to consider a backup successful, it has to remove the VM snapshot, and this is the part that fails; something on the VM host is locking it, though I have no idea what.

I do know that when I go back to my vSphere Client and look at the snapshots for that particular VM, it indicates there is a snapshot, but the snapshot does not appear in Snapshot Manager. If I create a new snapshot, I can then see something different (screenshot attached).

To successfully remove all the snapshots, I turn the box off, move it to another VM host, take a snapshot, and then go back to Snapshot Manager and select "Delete All".

Anyone out there with a similar issue or any idea on how I can fix this locking issue? I have been on VMWare support for the last two days back and forth without any progress; wanted to run this by EE :)
Avatar of Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Something to check here....

Can you manually take a Snapshot of the Virtual Machine, with the following options.

1. No tick in Snapshot Memory
2. Tick in the Quiesce Virtual Machine

Let me know the result?
you've also got about 10+ Snapshots?

has this been caused by the VCB?
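
The manual snapshot test described above (memory off, quiesce on) could also be requested through the vSphere API rather than the client. The sketch below is an assumption-heavy illustration in Python: it assumes `vm` is a pyVmomi `vim.VirtualMachine` object obtained from an existing vCenter session, and the function name and snapshot name are hypothetical. Whether a current pyVmomi release can talk to vCenter 4.0 is not something this thread confirms.

```python
def take_quiesced_snapshot(vm, name="manual-quiesce-test"):
    """Request a snapshot with no memory image and guest quiescing enabled.

    `vm` is assumed to be a pyVmomi vim.VirtualMachine (hypothetical setup,
    not shown here). The returned task must still be waited on by the caller.
    """
    return vm.CreateSnapshot_Task(
        name=name,
        description="Manual test: no memory, quiesced",
        memory=False,   # option 1: no tick in Snapshot Memory
        quiesce=True,   # option 2: tick in Quiesce (needs VMware Tools in the guest)
    )
```

If this succeeds while vcbMounter's snapshot removal still fails, that points away from quiescing in the guest and toward whatever holds the lock on the host or storage side.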
Avatar of mirde

ASKER

Yes from VCB tests; nothing manual.
Can you manually take a Snapshot of the Virtual Machine, with the following options.

1. No tick in Snapshot Memory
2. Tick in the Quiesce Virtual Machine

Let me know the result?
Avatar of mirde

ASKER

Both VMs are on the SAN (the backup VM and the VM being backed up, VAN-EDI03).

The EDI03 is on Fibre, the Backup is on SATA.

Will let you know the results of VM Snapshot with the specific options shortly.
what is the SAN?

Does this happen with ALL VMs being backed up?

Avatar of mirde

ASKER

I have been using VAN-EDI03 as my test case, which is a fresh copy of Windows 2008 R2 x64 SP1.

Could try building another VM as I have Win2008R2 templates to deploy from.
Avatar of mirde

ASKER

The LUNs are 2TB. There are multiple LUNs in different RAID configurations (RAID 5, RAID 0).

Mix of FC Hard Drives and SATA Hard Drives for different LUNs.

Still waiting for a process to finish before I try the snapshot you requested.
okay 50 and 100GB, I can see that.

It's odd: something has a lock on the VMDK file or the snapshot file, which is preventing it from being deleted.
Avatar of mirde

ASKER

There are two disks:

DISK 1 = C:\, VMDK = 50GB Thick Provisioning
DISK 2 = D:\, VMDK = 100GB Thick Provisioning
Avatar of mirde

ASKER

It's almost as if the vcbMounter.exe process that runs and creates the snapshot never releases it for the delete process.
Avatar of mirde

ASKER

Also, I did a Snapshot with:

1. No tick in Snapshot Memory
2. Tick in the Quiesce Virtual Machine

It was successful, and I can now verify in VM Properties that Hard Disk 1 and Hard Disk 2 are pointing to the VAN-EDI03-000001.vmdk and EDI03_1-000001.vmdk files.
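
A quick way to tell from a disk file name whether a VM is running on a snapshot delta is to look for the numeric suffix seen in the names above. This is a sketch in Python that assumes the "-NNNNNN.vmdk" pattern (six digits before the extension), inferred only from the two file names in this post; the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed naming pattern for snapshot delta disks: <base>-000001.vmdk
DELTA_DISK = re.compile(r"-(\d{6})\.vmdk$", re.IGNORECASE)


def snapshot_index(disk_file: str) -> Optional[int]:
    """Return the numeric delta suffix, or None for a base disk name."""
    match = DELTA_DISK.search(disk_file)
    return int(match.group(1)) if match else None
```

A disk name that yields None here while the client still reports a snapshot would match the "stuck" state described earlier in the thread.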
Well, vcbMounter interfaces with the storage APIs, and the storage API creates and commits the snapshot.

just check these disks are not part of any other VM
Avatar of mirde

ASKER

Strange thing is, taking that snapshot and deleting it through vSphere Client has no issues, no locking error.

I think it's isolated to vcbMounter and the storage API.

Is there a way to find out what is locking the file, for example by logging in to the VM host that runs the VM instance?
Has VMware asked for a vm-support or Support Log Bundle yet?

Have they examined the logs?
Avatar of mirde

ASKER

Not yet. Initially I had trouble getting past an "SSL Handshake" issue, which turned out to be because my vCenter Server port is not the default (443) but rather 9443; now I am stuck on this error.

Going to push for escalation, as I think I am still stuck in the loop of their first line of support.
Avatar of mirde

ASKER

Any particular logs I should be looking at for this?
I'm surprised they've not had a moan at you about using ESX 4.0.0 and vCenter 4.0.0!

/var/log/vmkernel

/var/log/vmkwarning

Avatar of mirde

ASKER

We will be upgrading to the latest version in the next few weeks. The server software was on a 2003 x86 box, and 4.1, I think, needs something higher (a newer OS or x64), which we can finally do now that best practice suggests the vCenter software should reside on the virtual infrastructure itself and not on bare-metal hardware.
You could do a tail -f on the vmkernel log (the latest vmkernel file)

and just before you do this, kick off a manual vcbMounter.exe
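
If watching the log live is impractical, a copy of the vmkernel or vmkwarning file can be scanned afterwards for lines that mention locks or snapshots. The following is a minimal sketch in Python, meant to be run on a workstation against a copied log file; the keyword list and the function name are assumptions, and the actual wording ESX uses for lock messages is not confirmed anywhere in this thread.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical search terms; adjust after looking at the real log content.
KEYWORDS = ("lock", "snapshot")


def suspicious_lines(
    lines: Iterable[str], keywords: Tuple[str, ...] = KEYWORDS
) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for lines containing any keyword, ignoring case."""
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            yield number, line.rstrip("\n")
```

Usage would be along the lines of opening the copied file and iterating: `with open("vmkernel") as handle: print(list(suspicious_lines(handle)))`. Comparing the timestamps of any hits with the 13:30 to 13:31 window in the vcbMounter output narrows down which entries relate to the failed snapshot removal.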
Yes VC does require 64 bit now.

We deploy all VCs on 64 bit VMs.
Avatar of mirde

ASKER

Correct, the latest version of VCB Framework.
Have VMware agreed VCB 1.5 Build 2 is okay with ESX 4.0.0?

do you know if they have tested or confirmed this shouldn't be an issue?
Avatar of mirde

ASKER

I have relayed your last message to my open case with them, that is a good question; in a perfect world it should be backwards compatible right ? :) :)
Well, what I mean is that it's suggested to be compatible with ESX 4.0 and VC 4.0.

I know that this was created for ESX 4.1 U1, because users complained that they are pulling support for the product.
Avatar of mirde

ASKER

Pulling support for which product? VCB?
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
This solution is only available to members.
You probably need to sift through your logs to see if there is a process lock on the disk file.
Avatar of mirde

ASKER

Interesting, though I thought VCB is the newer replacement for VDDK, right?

Maybe I am not using the right protocol for my backup? Any better suggestion? VCB? VDDK? VADP?

The product that will ultimately be running the nightly backup job is CA ArcServe r15 (SP1) and is pretty current (2011). They are currently in beta with r16, which supports CBT (Changed Block Tracking).

Maybe I should use VADP? Might have better luck with that.
SOLUTION
This solution is only available to members.
the other nail poised to be hammered into VCB's coffin is that it depends upon Converter to do restores with and the last release of Converter was supposed to have removed support for VCB (but I can't recall if it went to release that way or not)
Avatar of mirde

ASKER

Interesting feedback. I think I just crossed VCB off my list, as it has already started giving me trouble before I have even tried backing up with ArcServe.

Though I thought ArcServe makes use of VCB to back up VMware VMs. I will need to look at their guide to see what else I have available, as I just renewed our maintenance and added ArcServe VMware Backup Agents.
Don't get me wrong VCB is okay, I think it's just been overtaken by other vendors products.

It was never a full backup product. But remember this: VMware *SNAPSHOTS* are a function of ALL third-party backup products if you want backups of VMs (at block level, outside of the VM). They all interface with the APIs to call backups.

Due to this very issue, years ago we stopped using products that used the Snapshot process, because Snapshots are always troublesome (they stick, stop, or don't quiesce the VM), and heavily loaded VMs can cause issues.

So we backup machines using SAN Based Snapshots, and we don't bother snapshotting the VM at the same time, and some people will flame and moan about this, and state your backups are therefore only crash consistent, e.g. as good as if the machine has crashed!

but that works for us, may not work for everybody, but it works for us. (and others!)
Avatar of mirde

ASKER

Though not a "fix" for my issue, it pointed me to the right process; taking a look at the logs and pushing VMWare support to look at them.

It also suggested alternatives, such as VDDK, and described the technology as it evolved.

If I could add more than 500pts to you I would.

Thanks.
Avatar of koolkid1976
koolkid1976

Your files are being locked by the CA services on your vCenter server. To delete the snapshots, go into your vCenter server and stop all services starting with CA. Then use the vCenter console to take another snapshot of the VM that has the locked snapshot file. Then go to Snapshot Manager and select "Delete All". This should delete the snapshot you just made, as well as the other previously locked snapshot/s.
If that does not work, you'll have to migrate your VM to another ESX host, make another snapshot, then select "Delete All."
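
The "take another snapshot, then Delete All" sequence above could be scripted roughly as follows. This is a hedged sketch in Python, not a tested procedure for this environment: `vm` is assumed to be a pyVmomi `vim.VirtualMachine`, `wait_for_task` is assumed to be a blocking helper such as `pyVim.task.WaitForTask`, and the function and snapshot names are hypothetical. Stopping the CA services and migrating the VM are not covered here.

```python
def snapshot_then_delete_all(vm, wait_for_task, name="pre-delete-all"):
    """Create a throwaway snapshot, then remove every snapshot on the VM.

    `vm` is assumed to be a pyVmomi vim.VirtualMachine and `wait_for_task`
    a callable that blocks until a vSphere task finishes (both assumptions).
    """
    create_task = vm.CreateSnapshot_Task(
        name=name,
        description="Temporary snapshot taken before Delete All",
        memory=False,
        quiesce=False,
    )
    wait_for_task(create_task)

    # Equivalent of "Delete All" in Snapshot Manager.
    remove_task = vm.RemoveAllSnapshots_Task()
    wait_for_task(remove_task)
```

Passing the wait helper in as a parameter keeps the sketch free of connection details; if the lock is still held, the second task would be expected to fail with the same "locked" error seen in the vcbMounter output.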
Updating to ESX 5.0 resolved it all!