sFTP from Windows 2012 R2 Fails

TL;DR Version-
For months now, a process that transfers a CSV file over an sFTP connection across the public Internet has been failing periodically. The only way to get it going again is to log into the server; you don't have to do anything but log in.

The Problem-
We have a routine that runs every 15 minutes on a Windows 2012 Server that transfers a CSV file to an sFTP server on the public Internet. We are using a product called GoAnywhere Director. Periodically, and not on any clearly discernible timetable, that job will fail and it will report "connection refused". Initially I thought the problem was on the sFTP server side and out of my control.

The sFTP admin reported no such errors so I began packet captures. I ran captures on my Cisco ASA but never saw any packets coming from the Windows server to the sFTP server. I logged into the server and the next scheduled job was a success. I ran a packet capture on that server but it was all successful, so I left it running for a few days.

After a while it failed again. I logged into the server to troubleshoot, but just as before it started working again. I checked the packet capture and I see where we are sending an ACK but Wireshark reports that the sFTP server is sending a RST, ACK, then there is a spurious retransmission.

I don't believe that the reset is actually from them, despite what the packet capture reports. The reason is that during successful periods the time between the initial SYN and its next packet is 0.036972 seconds, while during a failure it is 0.00795 seconds. Furthermore, the next device in line (the firewall) should have seen the packets come through, but it never sees anything. Please refer to the attachment.
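As a rough sketch of the timing argument above: the two intervals are the ones quoted in this post, while the function names and the 0.5 cutoff are illustrative assumptions, not part of Wireshark or any other tool.

```python
# Intervals reported in the post, in seconds.
NORMAL_REPLY_S = 0.036972  # initial SYN to next packet when the transfer succeeds
FAILED_REPLY_S = 0.00795   # initial SYN to next packet (the RST, ACK) on failure


def speedup(normal_s: float, observed_s: float) -> float:
    """How many times sooner the observed reply arrived than the normal one."""
    return normal_s / observed_s


def reply_is_suspiciously_fast(normal_s: float, observed_s: float,
                               fraction: float = 0.5) -> bool:
    """True if the reply arrived in less than `fraction` of the normal time.

    The 0.5 default is an arbitrary illustrative threshold, not a standard.
    """
    return observed_s < normal_s * fraction


ratio = speedup(NORMAL_REPLY_S, FAILED_REPLY_S)
print(f"RST arrived {ratio:.2f}x sooner than a normal reply")
print("suspicious:", reply_is_suspiciously_fast(NORMAL_REPLY_S, FAILED_REPLY_S))
```

A reply that comes back several times faster than the usual round trip to the remote server is at least consistent with something closer to the client answering on the server's behalf, though timing alone does not prove it.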

Thinking it was a service provider issue, we switched to another provider (I have multiple links to choose from), but the issues continued. The next step was to move the process to another server using a different application (WinSCP) to transmit the file, but it has the same problem as GoAnywhere Director on Windows 2012. This second server runs Windows 2008 R2 (the original runs 2012 R2).

I thought the problem was Explicit Congestion Notification (ECN), so I disabled that on the Windows 2012 server, but that didn't correct my issue.

Right now as the issue occurs, I receive an email about a failed attempt then I log into the server. That works, but that also means I can't take a day off...
Thomas Pitchford Asked:

How are you checking to see if the firewall did or did not see the ACK?

What firewall are you using?  Some firewalls have a time limit on how long a connection can stay open and will terminate it and it does not matter if the connection is transferring data or not.
Thomas Pitchford (Author) Commented:
We have a Cisco ASA and I am using the packet capture functionality of it. During periods where the transmission is successful I see the packets. During periods where my application reports failures I don't see any packets.

Everything is pointing to the Windows server as the problem, though I don't understand why. All I do is log in to the server and it starts working!

I have set up another sFTP routine to compare against my production routine. I expect both routines to fail at the same time. The only problem is that the failures occur so randomly. It could be days before we have a failure, or it could fail in the next hour...
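A comparison routine like this does not need a full sFTP client; a bare TCP handshake to port 22 already separates "connection refused" from a timeout. The following is a minimal sketch, assuming Python is available on the server. The function names and the log format are made up for illustration and have nothing to do with GoAnywhere Director or WinSCP.

```python
import socket
import time
from datetime import datetime, timezone


def classify(exc: BaseException) -> str:
    """Map a connection exception to a short label for the log."""
    if isinstance(exc, ConnectionRefusedError):
        return "refused"   # a RST came back in answer to our SYN
    if isinstance(exc, socket.timeout):
        return "timeout"   # nothing came back within the time limit
    return f"error: {exc}"


def probe(host: str, port: int = 22, timeout: float = 5.0):
    """Attempt one TCP handshake; return (outcome, elapsed seconds)."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "ok"
    except OSError as exc:
        outcome = classify(exc)
    return outcome, time.monotonic() - started


def format_result(host: str, port: int, outcome: str, elapsed: float) -> str:
    """One timestamped log line per attempt, for lining up with captures."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} {host}:{port} {outcome} {elapsed:.6f}s"


def run_probes(host: str, port: int = 22, interval: float = 60.0, count: int = 10):
    """Probe `count` times, `interval` seconds apart, printing each result."""
    for _ in range(count):
        outcome, elapsed = probe(host, port)
        print(format_result(host, port, outcome, elapsed))
        time.sleep(interval)
```

Calling `run_probes("sftp.example.com")` (a placeholder host name) alongside the production job would give timestamps for every refused attempt, which makes it easier to match failures against the packet captures.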
I'm not familiar with GoAnywhere Director.  Does it run as a service?  What account does it run under?  Could it be set up as a scheduled task for a particular account?  If so, do you RDP to it and then disconnect?  If so, could it be that the rules have changed for disconnected RDP accounts?

Best regards...Paul
Thomas Pitchford (Author) Commented:
It's an application that runs as a service under a domain user account. It has its own internal job engine that executes the jobs. There is no way to use the Windows Task Scheduler.

The other server has the same problem, and that process is actually triggered through Microsoft SQL and scripted to run through WinSCP. It is my other test machine that has the same problems.
Do you have managed switches?  Could you mirror the port the sFTP client computer is on?
Thomas Pitchford (Author) Commented:
That server is in VMware. Not a bad suggestion; I will set that up tomorrow. I have packet captures from the servers themselves, and both report the same spurious retransmissions.
Since you can see the packets leaving the client's OS, but they never appear to hit the firewall, you need to try and do packet captures at each point you can in between the two.
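One way to keep track of that hop-by-hop search is sketched below. The capture point names come from this thread; the `bracket_drop` helper and the use of `None` for "not captured yet" are assumptions made purely for illustration.

```python
from typing import Optional, Sequence, Tuple

# Each entry: (capture point, did it see the SYN?). None means "no capture yet".
Observation = Tuple[str, Optional[bool]]


def bracket_drop(observations: Sequence[Observation]) -> Optional[Tuple[Optional[str], str]]:
    """Walk the path from client to Internet and return
    (last point known to see the packet, first point known not to see it).
    Returns None if no capture point has missed the packet."""
    last_seen = None
    for name, seen in observations:
        if seen is True:
            last_seen = name
        elif seen is False:
            return last_seen, name
    return None


# What is known at this stage of the thread: the client sees the packets,
# the firewall does not, and nothing in between has been captured yet.
path = [
    ("Windows server", True),
    ("ESXi host", None),
    ("switch 1", None),
    ("switch 2", None),
    ("Cisco ASA", False),
]
print(bracket_drop(path))
```

Each new capture replaces a `None` with `True` or `False` and narrows the bracket, so the next capture is best placed roughly in the middle of whatever range remains.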
Thomas Pitchford (Author) Commented:
To my amazement I saw failures this morning. I ran a packet capture from another VM on the same ESXi server and saw the failures. I then ran a packet capture from the ESXi host to ensure the host is seeing the traffic. All captures this morning showed the same set of frames as I originally attached.

I'm not running packet captures from the switches to which my ESXi host is connected. There are two switches before that traffic makes it to the outside network devices.

I will report my findings.

Thanks for your input.
I've just noticed your posted capture screenshot - doh.  Good job giltjr is on the case.  Do you have a trace of it working - in particular when the client connects to the service on TCP Port 22?
Looking at the screen shot from the first packet capture I just noticed that Wireshark reports spurious re-transmits of the SYN packets.

This means Wireshark sees the SYN packets going out and the RST/ACK packets coming in to the "NIC", but the RST/ACK packets are not being passed on to TCP. If TCP saw the RST/ACK packets it would reset the connection and notify the SFTP client.

This leads me to believe that somehow a firewall, or some other security software, is intercepting and dropping the RST/ACK packets before they are passed to TCP.
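The pattern described here can be checked mechanically once the capture rows are exported. A minimal sketch follows; the `Pkt` record and `rst_not_delivered` function are hypothetical names, and the sample trace is made up apart from the 0.00795 second interval quoted in the question.

```python
from typing import Iterable, NamedTuple


class Pkt(NamedTuple):
    t: float          # seconds since the first packet
    outbound: bool    # True if sent by the sFTP client
    sport: int        # client-side TCP port, identifies the connection attempt
    flags: frozenset  # TCP flags set on the packet, e.g. {"RST", "ACK"}


def rst_not_delivered(packets: Iterable[Pkt]) -> bool:
    """True if the client re-sends a bare SYN from the same source port
    after an inbound RST for that port was already captured. That suggests
    the RST reached the capture point but never reached the TCP stack."""
    reset_ports = set()
    for p in sorted(packets, key=lambda p: p.t):
        if not p.outbound and "RST" in p.flags:
            reset_ports.add(p.sport)
        elif p.outbound and p.flags == {"SYN"} and p.sport in reset_ports:
            return True
    return False


# Illustrative trace only: port number and retransmit time are invented.
trace = [
    Pkt(0.0, True, 50000, frozenset({"SYN"})),
    Pkt(0.00795, False, 50000, frozenset({"RST", "ACK"})),
    Pkt(0.5, True, 50000, frozenset({"SYN"})),  # retransmitted SYN
]
print(rst_not_delivered(trace))
```

Matching on the source port matters: an application that simply retries would open a new connection from a different port, which is not the same thing as TCP retransmitting a SYN it believes went unanswered.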
Thomas Pitchford (Author) Commented:
I was able to resolve my issue. For some reason our Barracuda Web Filter was blocking the traffic. I added the IP address of the sFTP server to the excluded IP and it has been working just fine.

I'm not happy with the explanation Barracuda gave me as to why it happened, but nonetheless this problem has been resolved.

Thomas Pitchford (Author) Commented:
My solution worked for my problem.