Thomas Pitchford

sFTP from Windows 2012 R2 Fails

TL;DR Version-
For months now we have had a process that transfers a CSV file over an sFTP connection through the public Internet, and it periodically fails. The only way to get it going again is to log into the server. You don't have to do anything but log in.


The Problem-
We have a routine that runs every 15 minutes on a Windows 2012 Server that transfers a CSV file to an sFTP server on the public Internet. We are using a product called GoAnywhere Director. Periodically, and not on any clearly discernible timetable, that job will fail and it will report "connection refused". Initially I thought the problem was on the sFTP server side and out of my control.

Troubleshooting-
The sFTP admin reported no such errors so I began packet captures. I ran captures on my Cisco ASA but never saw any packets coming from the Windows server to the sFTP server. I logged into the server and the next scheduled job was a success. I ran a packet capture on that server but it was all successful, so I left it running for a few days.

After a while it failed again. I logged into the server to troubleshoot, but just as before it started working again. I checked the packet capture and saw that we send a SYN, Wireshark reports that the sFTP server answers with a RST, ACK, and then there is a spurious retransmission of the SYN.

I don't believe that the reset is actually from them, despite what the packet capture reports. The reason is that during successful periods the time between the initial SYN and the next packet is 0.036972 seconds, while during a failure the time between packets is only 0.00795 seconds. Furthermore, the next device in line (the firewall) should have seen the packets come through, but it never sees anything. Please refer to the attachment.
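The timing argument above can be written down as a quick check. This is only a sketch: the function name and the 0.5 ratio are assumptions chosen for illustration, and a fast RST is a hint that something closer than the remote server answered, not proof of it.

```python
# Hypothetical helper: compare the delay before a RST with the normal
# SYN -> SYN/ACK round trip seen in a successful capture.
def rst_looks_local(baseline_rtt_s: float, rst_delay_s: float, ratio: float = 0.5) -> bool:
    """Return True when the RST arrived much faster than a normal round trip.

    ratio is an arbitrary cut-off (assumption): a RST that shows up in less
    than half the usual round-trip time is treated as suspicious.
    """
    if baseline_rtt_s <= 0:
        raise ValueError("baseline_rtt_s must be positive")
    return rst_delay_s < baseline_rtt_s * ratio


good = 0.036972  # seconds, SYN to next packet in the successful capture
bad = 0.00795    # seconds, SYN to RST in the failing capture

print(f"RST delay is {bad / good:.0%} of the normal round trip")
print("Probably generated before the WAN link:", rst_looks_local(good, bad))
```

With the two values from the captures the check comes out true, which is consistent with the firewall never seeing the traffic.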

Thinking it was a service provider issue, we switched to another provider (I have multiple links to choose from), but the issues continued. The next step was to move the process to another server, a Windows 2008 R2 machine (the original is 2012 R2), using a different application (WinSCP) to transmit the file. It has the same problem as GoAnywhere Director on Windows 2012.

I thought the problem was Explicit Congestion Notification (ECN), so I disabled that on the Windows 2012 server, but that didn't correct my issue.

Right now as the issue occurs, I receive an email about a failed attempt then I log into the server. That works, but that also means I can't take a day off...
Capture.PNG
giltjr

How are you checking to see whether the firewall did or did not see the ACK?

What firewall are you using?  Some firewalls have a time limit on how long a connection can stay open and will terminate it and it does not matter if the connection is transferring data or not.
Thomas Pitchford

ASKER

We have a Cisco ASA and I am using the packet capture functionality of it. During periods where the transmission is successful I see the packets. During periods where my application reports failures I don't see any packets.

Everything is pointing to the Windows server as the problem, though I don't understand why. All I do is log in to the server and it starts working!

I have set up another sFTP routine to compare against my production routine. I expect both routines to fail at the same time. The only problem is that the failures occur so randomly. It could be days before we have a failure, or it could fail in the next hour...
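Because the failures are so random, a small timestamped probe running next to both routines makes it easier to line failures up with the packet captures. The following is a minimal sketch, not part of GoAnywhere Director or WinSCP: the function names, the log file name and the host in the usage comment are placeholders. It only tests whether a TCP connection to port 22 can be opened and does not attempt an sFTP login.

```python
import socket
from datetime import datetime, timezone


def probe(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Try one TCP connection and return 'ok', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The SYN was answered with a RST, the same symptom the job reports.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


def log_probe(host: str, port: int = 22, logfile: str = "sftp_probe.log") -> str:
    """Run one probe and append a UTC-timestamped result line to logfile."""
    result = probe(host, port)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(logfile, "a", encoding="utf-8") as fh:
        fh.write(f"{stamp} {host}:{port} {result}\n")
    return result


# Example call (placeholder host name), e.g. once a minute from a scheduler:
# log_probe("sftp.example.com")
```

Running it on a shorter interval than the 15-minute job narrows down when a failure window starts and ends.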
I'm not familiar with GoAnywhere Director.  Does it run as a service?  What account does it run under?  Could it be set up as a scheduled task for a particular account?  If so, do you RDP to it and then disconnect?  If so, could it be that the rules have changed for disconnected RDP accounts?

Best regards...Paul
It's an application that runs as a service under a domain user account. It has its own internal job engine that executes the jobs. There is no way to use the Windows Task Scheduler.

The other server, which is my test machine, has the same problem. That process is triggered through Microsoft SQL and is scripted to run through WinSCP.
Do you have managed switches?  Could you mirror the port the sFTP client computer is on?
That server is in VMware. Not a bad suggestion; I will set that up tomorrow. I have packet captures from the servers themselves, and both report the same spurious retransmissions.
Since you can see the packets leaving the client's OS, but they never appear to hit the firewall, you need to try and do packet captures at each point you can in between the two.
To my amazement I saw failures this morning. I ran a packet capture from another VM on the same ESXi server and saw the failures. I then ran a packet capture from the ESXi host to ensure the host is seeing the traffic. All captures this morning showed the same set of frames as I originally attached.

I'm now running packet captures from the switches to which my ESXi host is connected. There are two switches before that traffic makes it to the outside network devices.

I will report my findings.

Thanks for your input.
I've just noticed your posted capture screenshot - doh.  Good job giltjr is on the case.  Do you have a trace of it working - in particular when the client connects to the service on TCP Port 22?
Looking at the screenshot from the first packet capture, I just noticed that Wireshark reports spurious retransmissions of the SYN packets.

This means that Wireshark sees the SYN packets going out and the RST/ACK packets coming in to the "NIC", but the RST/ACK packets are not being passed on to TCP. If TCP saw the RST/ACK packets it would reset the connection and notify the SFTP client.

This leads me to believe that somehow a firewall, or some other security software, is intercepting and dropping the RST/ACK packets before they are passed to TCP.
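The pattern described here (a SYN sent again after a RST has already come back) can be picked out of a long capture mechanically. The sketch below assumes the capture has been exported into simple tuples; the record layout, the function name and the addresses (documentation ranges) are all hypothetical and are not Wireshark's own export format.

```python
from collections import defaultdict


def summarize_attempts(packets, client_ip):
    """Classify each connection attempt, keyed by the client's source port.

    packets: iterable of (time_s, src_ip, src_port, dst_ip, dst_port, flags),
    where flags is a set such as {"SYN"} or {"RST", "ACK"} (assumed layout).
    """
    by_port = defaultdict(list)
    for t, src, sport, dst, dport, flags in packets:
        from_client = src == client_ip
        port = sport if from_client else dport
        by_port[port].append((t, from_client, frozenset(flags)))

    summary = {}
    for port, pkts in by_port.items():
        pkts.sort(key=lambda p: p[0])
        syns = [p for p in pkts if p[1] and p[2] == {"SYN"}]
        rsts = [p for p in pkts if not p[1] and "RST" in p[2]]
        synacks = [p for p in pkts if not p[1] and p[2] == {"SYN", "ACK"}]
        if synacks:
            summary[port] = "established"
        elif rsts and len(syns) > 1 and syns[-1][0] > rsts[0][0]:
            # A SYN was sent again after a RST was captured, so the
            # local TCP stack apparently never acted on that RST.
            summary[port] = "syn-retransmitted-after-rst"
        elif rsts:
            summary[port] = "reset"
        else:
            summary[port] = "no-reply"
    return summary


sample = [
    (0.000, "10.0.0.5", 50001, "203.0.113.10", 22, {"SYN"}),
    (0.008, "203.0.113.10", 22, "10.0.0.5", 50001, {"RST", "ACK"}),
    (0.508, "10.0.0.5", 50001, "203.0.113.10", 22, {"SYN"}),
    (1.000, "10.0.0.5", 50002, "203.0.113.10", 22, {"SYN"}),
    (1.037, "203.0.113.10", 22, "10.0.0.5", 50002, {"SYN", "ACK"}),
]
print(summarize_attempts(sample, "10.0.0.5"))
```

Attempts flagged as retransmitted after a RST are the ones worth comparing across the server, ESXi host and switch captures.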
ASKER CERTIFIED SOLUTION
Thomas Pitchford

My solution worked for my problem.