We are in the process of setting up SQL DB mirroring between a production location (primary) and a data center (DC). The actual DB mirroring is being handled by a third party; we're responsible for connectivity between the two locations.
The primary location has a DS3 running through a Netgear GB switch and then to (2) Sonicwall NSA 240 UTMs in a high-availability cluster. The data center has a 20Mbps burstable pipe with the identical NSA 240 configuration (Cisco desktop switch instead of Netgear). I'm getting consistent upload speeds from the primary location at just under 40Mbps; the download speed at the DC is 20Mbps (also very consistent). We have configured a VPN tunnel between the two locations:
Policy Type: Site to Site
Authentication Method: IKE using Preshared Key
Exchange: Main Mode
DH Group: Group2
Life Time: 28800
There are no errors or warnings being logged in the Sonicwall regarding the IPsec tunnel, and connectivity between the two locations seems pretty solid.
There are (2) main servers that we are concerned with at the primary location. Both are Dell PowerEdge R900 servers that are about 2 years old and still under full warranty. Both have dual Intel Xeon processors and 16GB of RAM. Both are running Windows Server 2008 Standard x64 with Service Pack 2 (NOT 2008 R2) and both are fully up-to-date with Windows Updates. We have deployed identically configured virtual servers at the data center (VMware ESXi 4.1 on a Dell PowerEdge R900 host).
The issue is that when we try to transfer large files (>100MB) from DBSERVER1 (primary) to DBSERVER2 (DC), the transfer ALMOST always fails. I say almost because we've been able to transfer a 100MB file with fair success and a 1.7GB file one time but it has never been consistent. At this point, transferring anything over 100MB is pretty sure to meet with failure. The transfer begins just fine and shows that it's moving at about 2.5MB/s; then it locks up for a period of time before finally failing with the following error message:
"Item Not Found
Could not find this item
This is no longer located in \\SERVERNAME\SHARENAME\. Verify the item's location and try again."
Sometimes the transfer will resume from lockup but then it will show that it's moving at about 500-900KB/s; it may even look to restart completely from the beginning but it generally doesn't get above 1MB/s transfer rate. There does not appear to be any error/event logged in either the Application or System logs at a time that would correspond to the failure.
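To pin down exactly where a transfer stalls, something like the following could be run on DBSERVER1 in place of an Explorer copy. This is only a sketch: `timed_copy` is a hypothetical helper (not something we have run here), the 1MB chunk size is arbitrary, and the paths in the usage line are placeholders. Writes are buffered by the OS, so the per-chunk rates are approximate.

```python
import time

def timed_copy(src, dst, chunk_size=1024 * 1024, log=print):
    """Copy src to dst in chunks, logging elapsed time, offset and rate per chunk.

    Hypothetical diagnostic helper. Returns the total number of bytes copied.
    If the copy fails, the last logged line shows roughly where it stopped.
    """
    total = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            t0 = time.monotonic()
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            elapsed = time.monotonic() - t0
            total += len(chunk)
            # Guard against a zero interval on coarse timers.
            rate = len(chunk) / elapsed / 1e6 if elapsed > 0 else float("inf")
            log(f"{time.monotonic() - start:8.1f}s  {total:>12d} bytes  {rate:8.2f} MB/s")
    return total
```

Usage would look like `timed_copy(r"D:\test\large.bak", r"\\SERVERNAME\SHARENAME\large.bak")`, where the local path is a placeholder. A long gap between two logged lines, or an exception after a particular offset, would show whether the stall always happens at a similar point in the file.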
Initial thoughts were that there were issues with the Sonicwall not being able to handle the throughput. We put this to the test by choosing other devices at the primary location and moving files FROM these locations. We can consistently (without failure or error) move files >500MB from at least one XP workstation and from a Windows Server 2003 x64 server to the desired destination at the DC. We believe this rules out the Sonicwall as we were able to transfer a 22GB .bak file from the 2003 server in about 3h last night.
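As a rough sanity check of those numbers against the 20Mbps pipe at the DC, here is the arithmetic as a small sketch. It assumes decimal units (1GB = 10^9 bytes) and that "about 3h" means exactly 3 hours, so the results are approximate.

```python
def mbps(bytes_transferred, seconds):
    """Convert a byte count over a duration into megabits per second."""
    return bytes_transferred * 8 / seconds / 1_000_000

MB = 1000 ** 2   # assumption: decimal megabytes
GB = 1000 ** 3   # assumption: decimal gigabytes

bak_rate = mbps(22 * GB, 3 * 3600)    # 22GB .bak file from the 2003 server in ~3h
start_rate = mbps(2.5 * MB, 1)        # 2.5MB/s shown when a 2008 transfer begins
stall_low = mbps(0.5 * MB, 1)         # 500KB/s after a transfer resumes
stall_high = mbps(0.9 * MB, 1)        # 900KB/s after a transfer resumes

print(f"2003 server .bak transfer: {bak_rate:.1f} Mbps")
print(f"2008 server, initial rate: {start_rate:.1f} Mbps")
print(f"2008 server, after resume: {stall_low:.1f} to {stall_high:.1f} Mbps")
```

Under those assumptions the 2003 server averaged a little over 16Mbps across the whole transfer, and the 2008 transfers start at roughly the full 20Mbps before dropping to somewhere between 4 and about 7Mbps once they resume.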
We also considered NIC traffic, as the DB server at the primary location was handling what seemed to be a heavy network load. We updated the driver, teamed the (2) Intel PRO 1000 NICs to maximize throughput, and stopped any other non-essential transfers. This did not fix the issue, so we used Windows Server 2008 Data Collector Sets for both LAN Diagnostics and System Performance. The NIC load does not appear out of the ordinary. We did see many System Diagnostics entries on the source server under Network>IP: "Datagrams Received Address Errors" (over 1000) and "Datagrams Received Discarded" (nearly 500). I couldn't begin to find a correlation here, though.
So from what we can see, the issue is that we can't move large files consistently between (2) Windows Server 2008 x64 Standard servers across a VPN tunnel. To make it more complicated, we can move the same files between servers on the LAN without any problem; the failures only occur over the VPN.
This is a time sensitive issue as the client is looking to have this done yesterday afternoon. Any assistance would be greatly appreciated. If there's anything I haven't covered or need to clarify, please ask.