Solved

Veeam Replication jobs fail with some sort of timeout

Posted on 2013-01-25
Last Modified: 2013-05-20
I have a number of Veeam Replication jobs that replicate server VMs offsite. Some network changes were recently made (adding a NAT setup between the office LAN and the remote site) and that caused all manner of problems, i.e. all my replication jobs failed. The changes were undone, but 2 of my 5 jobs still refuse to complete, and they give the same error I was seeing before the NAT changes were backed out.

The job fails with the error "Error: Client error: ChannelError: TimedOut" and Veeam Support seem intent on blaming our WAN link.  However, as I said 3 of my jobs complete OK. I've also copied a VM from local to remote hosts and VMware didn't fall over and complain about a WAN problem.

Data size doesn't seem to be a common factor, neither does OS (one server is Win2003 the other Win2008), antivirus (one server has it the other doesn't) or anything else that I can see.

I am at a loss as to why these 2 jobs (which I've tried deleting and recreating from scratch as well) get maybe up to 40-50% done and then just give up.

Can anyone help?

Thanks
Question by:funasset
16 Comments
 
LVL 5

Expert Comment

by:MaximVeeam
Which version of Veeam Backup do you use?
 

Author Comment

by:funasset
It reports 6.5.0.128 (64 bit)
 

Author Comment

by:funasset
I think that's the latest they have?
 
LVL 5

Expert Comment

by:MaximVeeam
Yes, you are right. My idea was that the software can be out-of-date. I am looking for a solution on Veeam Forums - http://forums.veeam.com/
 

Author Comment

by:funasset
I set a job going on Friday and it worked fine for 11 hours then just gave up with the same error.
 

Author Comment

by:funasset
Update:
I've been using a test VM (Windows 2008 Server Standard - clean install) to see if I can find some type of common denominator for the servers that fail to replicate. My results are:

Is the problem related to...
OS? It doesn't seem to be, as failures have included various operating systems.
AV software being present? No - again, jobs have failed regardless of AV software being present.
Data size? I don't know if Veeam just shoves the entire virtual disk down the wire or if it's dynamic and only sends the used disk data. In my test I loaded my test server from clean (12 GB) up to 150 GB and all jobs failed. 12 GB is less than some of the server jobs that have succeeded, which suggests to me that the amount of data involved is not a common factor.
Throttling? No - jobs have failed with Throttling On and Off.
Host datastore? No - job failure does not seem specific to any one datastore.
WAN link? Although Veeam Support seemed fixated on a WAN link problem, this doesn't explain why some jobs succeed. Also, VMware can migrate a VM copy of a failed source server to the remote host without any trouble, which suggests that the link is fine.
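
One way to check the WAN-link theory independently of Veeam is to log how long a plain TCP connection to the remote host takes over the length of a job, then compare any gaps with the time the job fails. The sketch below is only a rough illustration: the host name and port in the usage comment are placeholders that do not come from this thread, and a successful TCP connect shows reachability only, not how much bandwidth is available.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Return the TCP connect time in seconds, or None if the attempt fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections, unreachable hosts and timeouts
        return None


def summarise(samples):
    """Count failed probes and report the slowest successful one."""
    ok = [s for s in samples if s is not None]
    return {
        "probes": len(samples),
        "failed": len(samples) - len(ok),
        "slowest_s": max(ok) if ok else None,
    }


def monitor(host, port, count, interval_s=60.0):
    """Probe `count` times, printing one timestamped line per attempt."""
    samples = []
    for i in range(count):
        result = probe(host, port)
        samples.append(result)
        status = "FAILED" if result is None else f"{result * 1000:.0f} ms"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {status}")
        if i < count - 1:
            time.sleep(interval_s)
    return summarise(samples)


# Hypothetical usage: the host name and port below are placeholders.
# monitor("remote-esxi.example.com", 443, count=720, interval_s=60)
```

Running it for as long as a failing job normally lasts (the commented call would cover 12 hours at one probe per minute) gives a timeline to set against the moment the job reports "TimedOut".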

If anyone else has any other suggestions I'd be grateful as it's becoming a Royal pain!

Thanks
 
LVL 5

Expert Comment

by:MaximVeeam
Could you please provide the ticket number?
 

Author Comment

by:funasset
The problem seems to be down to bandwidth.  The remote host was retrieved and, once on the local LAN, all replication jobs completed OK, bar the usual Veeam "features" of moaning about CBT/"Cannot use SOAP" and calculating 'disk digests' for a job that finished only a few minutes earlier?

The acid test will be to see if the incremental replication jobs complete OK when the host is returned to the remote site and throttling is reinstated.

 
LVL 4

Expert Comment

by:Michael Rodríguez
Hi funasset,

Do you have a VEEAM proxy VM server set up on your destination ESXi host?  Having a proxy server greatly improves replication times.  The proxy server will then "hot add" the VMDK files of the servers you're replicating, assuming you configure the replication job properly.

I currently use VEEAM in house to replicate 10 VMs offsite.  Luckily our rate of change is really small (5-7GB usually) so replication finishes within 6-8 hours.
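
For comparison, the figures above (5-7 GB of changed data replicated in 6-8 hours) imply a fairly low average rate. The rough calculation below assumes 1 GB = 10^9 bytes and treats the whole job duration as transfer time, which understates the real link speed since a job also spends time on snapshots and processing.

```python
def average_mbit_per_s(changed_gb, hours):
    """Average rate in Mbit/s needed to move `changed_gb` gigabytes in `hours` hours."""
    bits = changed_gb * 1e9 * 8  # assumes decimal gigabytes
    seconds = hours * 3600
    return bits / seconds / 1e6


low = average_mbit_per_s(5, 8)   # smallest change over the longest window
high = average_mbit_per_s(7, 6)  # largest change over the shortest window
print(f"{low:.1f} to {high:.1f} Mbit/s")  # prints: 1.4 to 2.6 Mbit/s
```

At around 2 Mbit/s, a full 150 GB copy like the test mentioned earlier in the thread would take roughly a week, so incrementals are a very different proposition from a full replication over the same link.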

If you need more info on the proxy server config, let me know and I'll send you screenshots or whatever.

Hopefully this helps.
 

Author Comment

by:funasset
Hi and thanks for the info.

Yes I do have a remote proxy available but I'm not sure if I have the seeding set up OK. Certainly since I've had the physical server back in-house the replication jobs have been working fine but being on the local LAN might not highlight any cockup I might have made in defining where the replication job seeds from.

Some screenshots would be very welcome - many thanks.
 
LVL 4

Expert Comment

by:Michael Rodríguez
Hi funasset,

I've attached multiple screens of my configuration.  From what you stated, just make sure that during the config of the replication job, you're specifying the proper target proxy.

Let me know if you have any more questions.
veeam-proxy.jpg
veeam-job-successandinfo.jpg
veeam-replica-mapping.jpg
veeam-target-proxy.jpg
 

Author Comment

by:funasset
Many thanks.

It seems that I have my config the same as yours. I think my problem is down to bandwidth. I don't have exclusive access to our feeble WAN link so when other processes run they seem to squeeze Veeam jobs out. I thought that the Throttling feature would somehow create a fixed pipe for Veeam to use but it seems not and if these other processes need more bandwidth they just take it.

Now the host has been retrieved and new full replications have been created locally I'm hoping that the WAN link will be able to cope with just incremental jobs once the server has been put back.  If they still fail then I guess I'll have to push someone to get a better link!

Thanks again.

To be continued.....................
 
LVL 4

Expert Comment

by:Michael Rodríguez
No problem, funasset, hopefully you'll get the problem resolved soon.

Just curious, your replication target server, it doesn't have any issues in terms of a failed RAID card battery and/or failed disk, correct?

For example, the DL380 G5 I replicate to, if the Smart Array BBU fails on the RAID card, performance slows waaaay the hell down.

I know you said that you have your server locally again and speed seems to be fine when local, but you never know.
 

Author Comment

by:funasset
It had some sort of problem a while ago and the RAID card was replaced. I've run diagnostics on it while it's been here and (if you believe Dell Diagnostics!) it claims that all is well.

I appreciate the thought!
 

Accepted Solution

by:funasset
Since creating the new Veeam images with the remote host retrieved and sitting on the office LAN, incremental replication has been fine.  The problem appeared to be WAN related. Other processes required bandwidth on the WAN line and they were squeezing Veeam out.  In the end some traffic management values were tweaked in the VPN tunnel through which everything flows. Since then all the processes appear to be playing together nicely.
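
The difference between Veeam's throttling and the traffic management applied to the tunnel can be illustrated with a deliberately simplified model. It assumes that throttling acts only as a ceiling on Veeam's rate (which matches what was observed earlier in this thread), while a traffic-management rule can reserve a minimum share for it. All numbers are hypothetical, and real sharing depends on TCP behaviour and on the QoS features of the VPN device, neither of which is described here.

```python
def veeam_rate(link_mbit, other_demand_mbit, throttle_cap_mbit, reserved_mbit=0.0):
    """Rate left for Veeam (Mbit/s) under a simplified sharing model.

    Other traffic takes what it asks for, except for any share reserved for
    Veeam; the throttle cap is an upper limit only, not a guarantee.
    """
    other = min(other_demand_mbit, link_mbit - reserved_mbit)
    leftover = link_mbit - other
    return min(throttle_cap_mbit, leftover)


# Hypothetical 10 Mbit/s link with Veeam throttled to 4 Mbit/s.
print(veeam_rate(10.0, 3.0, 4.0))                      # quiet link: 4.0
print(veeam_rate(10.0, 12.0, 4.0))                     # busy link, no reservation: 0.0
print(veeam_rate(10.0, 12.0, 4.0, reserved_mbit=2.0))  # busy link, 2 Mbit/s reserved: 2.0
```

In this model the throttle setting never helps Veeam when the link is saturated; only the reservation does, which is consistent with the jobs recovering once the tunnel's traffic management was adjusted.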
 

Author Closing Comment

by:funasset
See previous post