Papnou asked:

Veeam Ver 6 & vCenter 4.1 Job Failure Issues

I'm hoping that someone can give me a hand with this very frustrating error that I have been receiving for the past 2 days which has caused all of our backups to fail repeatedly. The exact error message is the following:

Error: Client error: File does not exist or locked. VMFS path: [[DataStoreName] Server1/Server1.vmx]. Please, try to download specified file using connection to the ESX server where the VM registered. Failed to create NFC download stream. NFC path: [nfc://conn:123.456.789.50,nfchost:host-65,stg:datastore-3433@Server1/Server1.vmx].
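The NFC path in the error bundles several identifiers, and splitting them apart makes it easier to see which connection, host, and datastore the failing request targeted. The following is a minimal Python sketch that assumes the layout `nfc://key:value,...@file`, inferred only from this one error message and not from Veeam or VMware documentation. The function name `parse_nfc_path` is hypothetical.

```python
import re

def parse_nfc_path(path):
    """Split an NFC path like the one in the error into its parts.

    Assumed layout (inferred from the error text only):
        nfc://key:value,key:value,...@relative/file/path
    """
    match = re.match(r"^nfc://(?P<params>[^@]*)@(?P<file>.+)$", path)
    if match is None:
        raise ValueError(f"not an NFC path: {path!r}")
    fields = {}
    for item in match.group("params").split(","):
        # Each item looks like "conn:1.2.3.4"; split on the first colon only.
        key, _, value = item.partition(":")
        fields[key] = value
    fields["file"] = match.group("file")
    return fields

parts = parse_nfc_path(
    "nfc://conn:123.456.789.50,nfchost:host-65,"
    "stg:datastore-3433@Server1/Server1.vmx"
)
for key, value in parts.items():
    print(f"{key:8} {value}")
```

Here `conn` is the address the backup server connected to, while `host-65` and `datastore-3433` appear to be vCenter object identifiers for the ESX host and the datastore.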

I have open support requests with both Veeam and VMware on this issue. Each vendor is pointing the finger at the other, which is very frustrating.

The error seems to have been caused by our switch dying, which in turn caused vCenter to lose communication with its Service Console connection, the ESX hosts, and all the VMs. Our entire company was down until we set up a new switch later that day. Everything came back up and started working again, except that our backups failed that night and every night since. We are running Veeam Version 6 Patch 3, vCenter 4.1, and ESX 4.1 Update 2.

Veeam Tech Support says there is an NFC communication issue that VMware should help resolve, but VMware says that Veeam is using their API incorrectly. Here is VMware's official response: "Unfortunately, VMware will not be addressing this issue until the next major release at this point, as from our perspective, the API in question is not actually reacting poorly, simply being used incorrectly. The API in question is called the CopyDatastoreFile_Task API, and is designed for use on files that are not locked by our VMFS Distributed Locking system. There are alternative APIs that are available for copying/accessing locked files appropriately (by farming the task to the lock-owning ESX host, or by using one of the alternative lock-slots, depending on what type of lock is in place)."

Here is what I tried before opening tickets with both Veeam and VMware:

1. Rebooted the vCenter and Veeam servers.
2. Rebooted the ESX servers to try to clear this lock.
3. Deleted and re-created the jobs on the Veeam server.
4. Verified communication from the Veeam server to vCenter and all ESX hosts.
5. Powered off the vCenter server and the Veeam server.
6. vMotioned VMs to different ESX hosts and also to different datastores.
7. Restarted the management agent and vCenter agent on all ESX servers.

The jobs are still failing with that same issue.

The weird thing is that if I add my ESX hosts to Veeam individually by IP address, I can back up my VMs; it only fails through vCenter. I also cannot download the .vmx file or any other file through vCenter; I can only download it directly from the ESX host. The issue goes away after I reboot the host but comes back once Veeam tries to run a backup job and fails.
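One way to reproduce the "downloads from the host but not through vCenter" symptom outside of Veeam is to request the same file from both endpoints over the datastore browser's HTTPS interface. The sketch below only builds the two URLs to compare. It assumes the `/folder/<path>?dcPath=...&dsName=...` URL layout used by the vSphere datastore browser and the `ha-datacenter` name that a standalone ESX host reports. The server names and the datacenter name `MyDatacenter` are hypothetical placeholders.

```python
from urllib.parse import quote, urlencode

def datastore_file_url(server, datastore, file_path, datacenter="ha-datacenter"):
    """Build a datastore-browser URL for one file.

    Assumption: the server exposes files at
        https://<server>/folder/<file_path>?dcPath=<dc>&dsName=<datastore>
    A standalone ESX host uses the datacenter name "ha-datacenter";
    vCenter needs the real datacenter name from its inventory.
    """
    query = urlencode({"dcPath": datacenter, "dsName": datastore}, quote_via=quote)
    return f"https://{server}/folder/{quote(file_path)}?{query}"

# Hypothetical names; substitute the real vCenter, host, and datacenter.
via_host = datastore_file_url("esx01.example.com", "DataStoreName",
                              "Server1/Server1.vmx")
via_vcenter = datastore_file_url("vcenter.example.com", "DataStoreName",
                                 "Server1/Server1.vmx", datacenter="MyDatacenter")
print(via_host)
print(via_vcenter)
```

Fetching both URLs with the same credentials (in a browser or any HTTP client) should show whether the failure is specific to the path through vCenter.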

I'm open to any suggestions that anyone has.

Thanks in advance.
Papnou (ASKER)

Another thing I wanted to mention is that there is no Service Console lock on the file. I verified that by running lsof | grep. I did find the MAC address of the lock holder, which is an unused NIC, by running vmkfstools -D (path to .vmx file). The result of that command is below.

Lock [type 10c00001 offset 42524672 v 636, hb offset 3895296
gen 45, mode 1, owner 4f634c4d-63c2a440-1e08-001b2187be06 mtime 444]
Addr <4, 42, 12>, gen 13, links 1, type reg, flags 0, uid 0, gid 0, mode 755
len 3488, nb 1 tbz 0, cow 0, zla 2, bs 65536

MAC Address of owner = 001b2187be06 (vmnic10)
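The last group of the `owner` field in the vmkfstools -D output is commonly read as the MAC address of a NIC on the host holding the lock, which is how the value above was obtained. As a rough illustration, this Python sketch pulls that group out of the pasted output. The function name `lock_owner_mac` is hypothetical, and the pattern is based only on the output format shown in this post.

```python
import re

# Matches "owner xxxxxxxx-xxxxxxxx-xxxx-<12 hex digits>" as seen in the output above.
OWNER_RE = re.compile(
    r"owner\s+[0-9a-f]{8}-[0-9a-f]{8}-[0-9a-f]{4}-([0-9a-f]{12})",
    re.IGNORECASE,
)

def lock_owner_mac(vmkfstools_output):
    """Return the lock owner's MAC as aa:bb:cc:dd:ee:ff, or None.

    None is returned when no owner field is found or when the last
    group is all zeros, which carries no host MAC.
    """
    match = OWNER_RE.search(vmkfstools_output)
    if match is None:
        return None
    raw = match.group(1).lower()
    if raw == "0" * 12:
        return None
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))

SAMPLE = """Lock [type 10c00001 offset 42524672 v 636, hb offset 3895296
gen 45, mode 1, owner 4f634c4d-63c2a440-1e08-001b2187be06 mtime 444]
Addr <4, 42, 12>, gen 13, links 1, type reg, flags 0, uid 0, gid 0, mode 755
len 3488, nb 1 tbz 0, cow 0, zla 2, bs 65536"""

print(lock_owner_mac(SAMPLE))
```

The resulting address can then be compared against the NIC list of each ESX host to find which host owns the lock.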

It seems very strange to me that an unused NIC, not configured or plugged into anything, can hold a lock on a VM.

Just wanted to post this in case anyone has run into this.  I have a feeling this is directly related to my post above.  Thanks.
Andrew Hancock (VMware vExpert PRO / EE Fellow / British Beekeeper)
Check whether the servers' IP addresses and names can be resolved, e.g. via DNS,

or use local hosts files on the backup server.
Papnou (ASKER)

Thanks for the suggestion. I have double-checked that DNS is working correctly. I can ping each ESX server by IP and by name. Any other suggestions?
What about reverse DNS and traceroute? Do they resolve the same?
Papnou (ASKER)

Yes, it does.  They both resolve the same.
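For anyone repeating the forward and reverse DNS check discussed above, it can be scripted from the backup server with the Python standard library. This is a minimal sketch, not part of the original troubleshooting. The function name `forward_reverse_match` and the host names are hypothetical, and the comparison ignores the domain suffix on the assumption that short names are what matter here.

```python
import socket

def forward_reverse_match(name,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve name -> IP -> name and report whether the two names agree.

    The resolver functions are parameters so the check can be exercised
    without a live DNS server; by default the system resolver is used.
    """
    ip = forward(name)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, ips)

    def short(host):
        # Compare only the first label, case-insensitively.
        return host.lower().rstrip(".").split(".")[0]

    return {
        "name": name,
        "ip": ip,
        "reverse_name": reverse_name,
        "consistent": short(name) == short(reverse_name),
    }

# Demonstration with stub resolvers (hypothetical names and address).
demo = forward_reverse_match(
    "esx01",
    forward=lambda n: "10.0.0.11",
    reverse=lambda ip: ("esx01.example.com", [], [ip]),
)
print(demo)
```

Calling `forward_reverse_match("your-esx-host")` without the stub arguments uses the system resolver, so it should be run on the Veeam server itself to see what that machine resolves.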
ASKER CERTIFIED SOLUTION
Papnou

[The text of this solution is not included here; it is available only to Experts Exchange members.]
Papnou (ASKER)

Along with this posting on EE, I also put a ticket in with Veeam Tech Support.  They helped me fix this problem.  I just thought I should post the fix and close out this request.  Thanks.