Link to home
Start Free TrialLog in
Avatar of sevenet
sevenetFlag for Poland

asked on

Host is in a state that cannot perform snapshot

Hello,

We have a weird case with backing up VMs on VMware ESX 5.5. It doesn't matter if we use native VMware's VDP, EMC Networker's VBA or VADP solutions, the message is always the same:

2013-12-17 03:00:34 avnwcomm Error <0000>: Plugin returned non-zero exit code (10059). Aborting NetWorker connection.
2013-12-17 03:00:24 avvcbimage Warning <6788>: Found unknown section 'browse-from-catalog' in workorder, ignoring
2013-12-17 03:00:25 avvcbimage Warning <6788>: Found unknown section 'browse-from-catalog' in workorder, ignoring
2013-12-17 03:00:34 avvcbimage Warning <16004>: Soap fault detected, Could not connect to getHostMoref 'esx2.local'., Msg:'SOAP 1.1 fault: SOAP-ENV:Client [no subcode]
2013-12-17 03:00:34 avvcbimage FATAL <0000>: [IMG0009] The Host 'esx2.local' is in a state that cannot perform snapshot operations.
2013-12-17 03:00:34 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation failed

The google is silent about it. We have restarted services, physical hosts, vcenter, VDP, VBA, Networker, nothing helps. The weirdest thing is that it was working for about a week, and then stopped. No known changes were made to VMware environment. Does anyone have an idea what could be the cause or at least how to diagnose deeply this issue?

Best regards,
Krzysztof
Avatar of Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Flag of United Kingdom of Great Britain and Northern Ireland image

Does this happen if you manully snapshot the VM?
try to  rebooting the vsphere host first and then see.
1, please see that the Virtual machine startup / shutdown with host feature is disabled before this step

2. Restart the management agents on the host

3. On vCenter Server, restart the vCenter and Management Web servicesand
and then try to take snapshot.

and the very important ...does this happen only with one vm or any vm ??



please share the output
Avatar of compdigit44
compdigit44

Did you have the same issue running on ESXi 5.1?
If you connect to the ESXi 5.5 host directly using the VI Client are you able to snapshot the VM?
Avatar of sevenet

ASKER

Sorry for a little delay.

Manual snapshots work OK.

We have restarted management agents on the host, and the host itself too, at the end.

We have restarted vcenter services and the server itself, countless times

It happens to all VMs, regardless of operating system or host on which it runs.

We didn't have a chance to check 5.1 in this environment, as it was already running 5.5, however, we have a lab environment with 5.1 and there are no such issues at all. We cannot downgrade production environment to 5.1 :-(

The interesting thing is that Networker's VBA from EMC was working for about a week, and then stopped with this message (we do not recall any major changes to the environment which could have caused the failure). Then, we tried VADP frmo EMC, and finally to eliminate Networker's problem we installed VDP from VMware. The message is the same. It seems it is originated by either ESX or vCenter (most probable). We plan to reinstall vCenter, but this is rather long-term plan if other ideas fail.
If this is causing long standing service issues, I would now escalate to VMware Support.
Avatar of sevenet

ASKER

Yes, we have opened the case, but so far, for about two weeks, noone is able to track the root cause. We have a webex session scheduled for tomorrow. I though maybe someone already found the solution :-) From what I google around, many people have similiar issues. It looks like some kind of ghost problem :-/
What build number of ESXi 5.5 are you using? Also are using an OEM version of ESXi of the generic VMware version from the VMware site?

Are the device drivers updated on your host? BIOS, NIC's, Storage HBA etc....

Any errors for the host in your SAN logs??

What type of permissions is the backup agent running under ?
Avatar of sevenet

ASKER

vCenter is 5.5.0 build 1312298

ESXi is generic installation (ISO from VMware): VMware ESXi, 5.5.0, 1331820

We have checked permissions. Just to make sure there are no issues we use administrator account.

There are no errors on SAN. All datastores work OK.
Avatar of sevenet

ASKER

Here is the most recent log from VDP:
--------------------------------------------------------------------------------------------------------
-----  START avvcbimage log 2013-12-30 23:00:05 CET  [7.0.181-86 Linux-x86_64]
--------------------------------------------------------------------------------------------------------

2013-12-30T23:00:05.321+1:00 avvcbimage Info <5008>: Logging to /usr/local/avamarclient/var-proxy-5/systems_daily-systems_daily-1388440800090-9f618cf4cfbe04eb51e7c62438d310bd7a7d9356-1016-vmimagel.log
2013-12-30T23:00:05.321+1:00 avvcbimage Info <5174>: - Reading /root/.avamar
2013-12-30T23:00:05.321+1:00 avvcbimage Info <5174>: - Reading /usr/local/avamar/etc/usersettings.cfg
2013-12-30T23:00:05.321+1:00 avvcbimage Info <5174>: - Reading /usr/local/avamarclient/var-proxy-5/avvcbimage.cmd
2013-12-30T23:00:05.321+1:00 avvcbimage Info <5174>: - Reading /usr/local/avamarclient/var/avvcbimageAll.cmd
2013-12-30T23:00:05.325+1:00 avvcbimage Info <6673>: CTL listening on port 47040
2013-12-30T23:00:05.327+1:00 avvcbimage Info <10684>: Setting ctl message version to 3 (from 1)
2013-12-30T23:00:05.328+1:00 avvcbimage Info <16136>: Setting ctl max message size to 268435456
2013-12-30T23:00:05.332+1:00 avvcbimage Warning <6788>: Found unknown section 'browse-from-catalog' in workorder, ignoring
2013-12-30T23:00:05.335+1:00 avvcbimage Warning <17809>: VAMI Statistics Push Provider DISABLED.
2013-12-30T23:00:05.359+1:00 avvcbimage Info <18664>: Login(https://vcenter.local:443/sdk) Datacenter: 'PRODUCTION'
2013-12-30T23:00:05.360+1:00 avvcbimage Info <0000>:      - connected to 'VirtualCenter' - version: 'VMware vCenter Server 5.5.0 build-1312298',  apiVersion:'5.5' 
2013-12-30T23:00:05.481+1:00 avvcbimage Info <14692>: Login test is OK.
2013-12-30T23:00:05.713+1:00 avvcbimage Info <0000>: VM Info: VMX version 4(F), ctk enabled(T), fault tolerant (F), Guest OS CentOS 4/5/6 (64-bit), Cloud UUID = NoUUID. version=8
2013-12-30T23:00:05.797+1:00 avvcbimage Info <14625>: WorkOrder Disk Info:
  Disk '2000': file(base):'[LUN0-ESX] dns/dns.vmdk', backItUp=1
               snapshot file:'[LUN0-ESX] dns/dns.vmdk'
               prior size(KB):0, current size(KB):10485760, match=0
               prior change block ID:''
               Datastore:'LUN0-ESX' Directly Accessible=1
2013-12-30T23:00:05.798+1:00 avvcbimage Info <11979>: proxy VM DNS name is: 127.0.0.2,vdp0.local,vdp0,10.1.211.20
2013-12-30T23:00:05.798+1:00 avvcbimage Info <18672>: proxy VM Instance name is: vdp0-proxy-5
2013-12-30T23:00:05.798+1:00 avvcbimage Info <17793>: proxy VM Moref is: vm-726
2013-12-30T23:00:05.798+1:00 avvcbimage Info <16010>: vCenter 'PRODUCTION' is vcenter.local
2013-12-30T23:00:05.798+1:00 avvcbimage Info <11981>: VM's host is esx1.local
2013-12-30T23:00:05.798+1:00 avvcbimage Info <11982>: VM's primary storage location is [LUN0-ESX] dns/dns.vmx
2013-12-30T23:00:05.798+1:00 avvcbimage Info <11983>: VM's vCenter ID is vm-98
2013-12-30T23:00:06.545+1:00 avvcbimage Warning <16004>: Soap fault detected, Could not connect to getHostMoref 'esx1.local'., Msg:'SOAP 1.1 fault: SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
'
2013-12-30T23:00:06.554+1:00 avvcbimage Warning <16004>: Soap fault detected, Get Host properties problem, Msg:'SOAP 1.1 fault: "":ServerFaultCode [no subcode]
"
Required property obj is missing from data object of type ObjectSpec

while parsing serialized DataObject of type vmodl.query.PropertyCollector.ObjectSpec
at line 2, column 651

while parsing property "objectSet" of static type ArrayOfObjectSpec

while parsing serialized DataObject of type vmodl.query.PropertyCollector.FilterSpec
at line 2, column 445

while parsing call information for method RetrieveProperties
at line 2, column 273

while parsing SOAP body
at line 2, column 258

while parsing SOAP envelope
at line 2, column 0

while parsing HTTP request for method retrieveContents
on object of type vmodl.query.PropertyCollector
at line 1, column 0"
Detail: <InvalidRequestFault xmlns="urn:vim25" xsi:type="InvalidRequest"></InvalidRequestFault>
'
2013-12-30T23:00:07.030+1:00 avvcbimage Warning <16004>: Soap fault detected, Could not connect to getHostMoref 'esx1.local'., Msg:'SOAP 1.1 fault: SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
'
2013-12-30T23:00:07.039+1:00 avvcbimage Warning <16004>: Soap fault detected, Get Host properties problem, Msg:'SOAP 1.1 fault: "":ServerFaultCode [no subcode]
"
Required property obj is missing from data object of type ObjectSpec

while parsing serialized DataObject of type vmodl.query.PropertyCollector.ObjectSpec
at line 2, column 651

while parsing property "objectSet" of static type ArrayOfObjectSpec

while parsing serialized DataObject of type vmodl.query.PropertyCollector.FilterSpec
at line 2, column 445

while parsing call information for method RetrieveProperties
at line 2, column 273

while parsing SOAP body
at line 2, column 258

while parsing SOAP envelope
at line 2, column 0

while parsing HTTP request for method retrieveContents
on object of type vmodl.query.PropertyCollector
at line 1, column 0"
Detail: <InvalidRequestFault xmlns="urn:vim25" xsi:type="InvalidRequest"></InvalidRequestFault>
'
2013-12-30T23:00:07.039+1:00 avvcbimage Info <0000>: Proxy Host Name=esx1.local() VM Host Name=esx1.local()
2013-12-30T23:00:07.039+1:00 avvcbimage Info <11984>: VM's backup account on vdp0.local is /vcenter.local/VirtualMachines/dns_Unag9GyalhQeq8btPpE68g
2013-12-30T23:00:07.039+1:00 avvcbimage Info <11986>: Changed block tracking is engaged for this VM
2013-12-30T23:00:07.039+1:00 avvcbimage Info <11988>: A reference to a valid prior backup is not available so this will be a full level zero backup.
2013-12-30T23:00:07.050+1:00 avvcbimage Warning <6788>: Found unknown section 'browse-from-catalog' in workorder, ignoring
2013-12-30T23:00:07.053+1:00 avvcbimage Info <5174>: - Reading /usr/local/avamarclient/var-proxy-5/avvcbimage.cmd
2013-12-30T23:00:07.053+1:00 avvcbimage Info <5174>: - Reading /usr/local/avamarclient/var/avvcbimageAll.cmd
2013-12-30T23:00:07.053+1:00 avvcbimage Info <7084>: target[0]=[LUN0-ESX] dns/dns.vmdk
2013-12-30T23:00:07.067+1:00 avvcbimage Info <18664>: Login(https://vcenter.local:443/sdk) Datacenter: 'PRODUCTION'
2013-12-30T23:00:07.067+1:00 avvcbimage Info <0000>:      - connected to 'VirtualCenter' - version: 'VMware vCenter Server 5.5.0 build-1312298',  apiVersion:'5.5' 
2013-12-30T23:00:07.084+1:00 avvcbimage Info <14692>: Login test is OK.
2013-12-30T23:00:07.109+1:00 avvcbimage Info <16041>: VDDK:OBJLIB-LIB: Objlib initialized.

2013-12-30T23:00:07.109+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: Attempting to locate advanced transport module in "/usr/lib/vmware-vix-disklib".

2013-12-30T23:00:08.151+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: Advanced transport plugin was successfully loaded into vixDiskLib.

2013-12-30T23:00:08.152+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: Enabling advanced transport modes.

2013-12-30T23:00:08.152+1:00 avvcbimage Info <16041>: VDDK:OBJLIB-LIB: Objlib initialized.

2013-12-30T23:00:08.153+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: Initialize transport modes.

2013-12-30T23:00:08.154+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: Available transport modes: file:san:hotadd:nbdssl:nbd.

2013-12-30T23:00:08.154+1:00 avvcbimage Info <16041>: VDDK:VMware VixDiskLib (5.5) Release build-1284542

2013-12-30T23:00:08.234+1:00 avvcbimage Info <16041>: VDDK:VixDiskLibVim: VixDiskLibVim_Init: Initialization is completed.

2013-12-30T23:00:08.234+1:00 avvcbimage Info <9666>: Available transport modes are file:san:hotadd:nbdssl:nbd
2013-12-30T23:00:08.234+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_PrepareForAccess: Prepare to access disk.

2013-12-30T23:00:08.234+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Connect: Establish connection.

2013-12-30T23:00:08.234+1:00 avvcbimage Info <16041>: VDDK:VixDiskLibVim: VixDiskLibVim_AllowVMotion: Disable VMotion.

2013-12-30T23:00:10.761+1:00 avvcbimage Info <17818>: VixDiskLib vMotion reservation successfully enabled by 'vSphere Data Protection'
2013-12-30T23:00:10.761+1:00 avvcbimage Info <16011>: Running cleanup of old backups.
2013-12-30T23:00:10.761+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Cleanup: Remove previous mount points and clean up .

2013-12-30T23:00:10.761+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Connect: Establish connection.

2013-12-30T23:00:10.761+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Disconnect: Disconnect.
2013-12-30T23:00:10.761+1:00 avvcbimage Info <16048>: VixDiskLib_Cleanup() found no prior mounts to cleanup.
2013-12-30T23:00:10.781+1:00 avvcbimage Info <18664>: Login(https://vcenter.local:443/sdk) Datacenter: 'PRODUCTION'
2013-12-30T23:00:10.781+1:00 avvcbimage Info <0000>:      - connected to 'VirtualCenter' - version: 'VMware vCenter Server 5.5.0 build-1312298',  apiVersion:'5.5' 
2013-12-30T23:00:10.817+1:00 avvcbimage Info <14692>: Login test is OK.
2013-12-30T23:00:11.912+1:00 avvcbimage Warning <16004>: Soap fault detected, Could not connect to getHostMoref 'esx1.local'., Msg:'SOAP 1.1 fault: SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out"
Detail: [no detail]
'
2013-12-30T23:00:11.990+1:00 avvcbimage FATAL <0000>: [IMG0009] The Host 'esx1.local' is in a state that cannot perform snapshot operations.
2013-12-30T23:00:11.990+1:00 avvcbimage Info <9772>: Starting graceful (staged) termination, Host is in a state that cannot perform snapshots. (wrap-up stage)
2013-12-30T23:00:11.990+1:00 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation  or pre/post snapshot script failed
2013-12-30T23:00:11.991+1:00 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation/pre-script/post-script failed
2013-12-30T23:00:11.991+1:00 avvcbimage Info <9763>: snapshot created:false NOMC:false ChangeBlTrackingAvail:true UsingChBl:true
2013-12-30T23:00:11.991+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_EndAccess: Disk access completed.
2013-12-30T23:00:11.991+1:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Connect: Establish connection.
2013-12-30T23:00:11.991+1:00 avvcbimage Info <16041>: VDDK:VixDiskLibVim: VixDiskLibVim_AllowVMotion: Enable VMotion.
2013-12-30T23:00:11.993+1:00 avvcbimage Info <16038>: Final summary, cancelled/aborted 0, snapview 1, exitcode 209: plugin error 09
2013-12-30T23:00:14.224+1:00 avvcbimage Info <17819>: VixDiskLib vMotion reservation successfully released

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of piyushranusri
piyushranusri
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of sevenet

ASKER

Hello,

Actually it was my post :-) Since we had a very serious issue with it I tried posting in many places hoping to find an answer as soon as possible. Indeed, increasing LUN size helped, but the above messages are ridiculous comparing to the real reason. I would have never guessed from that output that the problem was the LUN space. Anyway, still no one is able to answer my question why was this space a case :-)

Best regards,
Krzysztof