We have a W2k8 FTP server (IIS) in our DMZ (PIX firewall) that accepts thousands of FTP transmissions daily from our mainframe (inside our network) and from our laptops (outside the network).
Typically, a connection is made, the credentials authenticate successfully, and a transmission begins, then ends when complete.
This is a sample of a typical successful transaction from our FTP server logs:
14:29:49 10.0.3.1 USER curtiscc.net\mfdownload 331 0
14:29:49 10.0.3.1 PASS - 230 0
14:29:49 10.0.3.1 CWD \curtis\Watch_n_launch\in 250 0
14:29:49 10.0.3.1 created /curtis/Watch_n_launch/in/
14:29:51 10.0.3.1 created /curtis/Watch_n_launch/in/
In this transmission, two files were successfully delivered to the FTP server (a .zip and an .ins).
Intermittently (maybe once every several days), exactly 23 seconds after the connection is made and authentication succeeds, the transmission times out with a “10060” FTP error. That transfer terminates, and the FTP thread continues with the next step successfully, as if nothing were wrong.
This is a sample of a failed transaction from our FTP server logs:
14:30:01 10.0.3.1 USER curtiscc.net\mfdownload 331 0
14:30:01 10.0.3.1 PASS - 230 0
14:30:01 10.0.3.1 CWD \curtis\Watch_n_launch\in 250 0
14:30:24 10.0.3.1 created PRINT_APGM999_110819102956.zip 425 10060
14:32:03 10.0.3.1 created /curtis/Watch_n_launch/in/
14:32:03 10.0.3.1 QUIT - 226 0
In this transmission, the “10060” timeout occurs exactly 23 seconds after the successful logon and directory change. After it fails, the same thread continues to the next step and successfully delivers the next file. It also seems to be thread- or connection-specific (not a pause of the whole FTP service): we have seen cases where, within the 23-second period of inactivity, a new thread starts and completes before the original thread eventually times out with the “10060” error.
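In case it helps anyone check their own logs for the same pattern, here is a rough Python sketch of how the failures and the 23-second gap could be pulled out of a log. It assumes the field layout seen in the samples above (time, client IP, verb, argument, FTP reply code, Win32 status); the function names `parse_line` and `find_timeouts` are made up for illustration. Since the samples carry no session ID, it measures the gap per client IP, so interleaved sessions from the same host would shorten the reported gap.

```python
from datetime import datetime

# Failed transaction from the question, with the wrapped .zip line joined.
SAMPLE_LOG = r"""
14:30:01 10.0.3.1 USER curtiscc.net\mfdownload 331 0
14:30:01 10.0.3.1 PASS - 230 0
14:30:01 10.0.3.1 CWD \curtis\Watch_n_launch\in 250 0
14:30:24 10.0.3.1 created PRINT_APGM999_110819102956.zip 425 10060
14:32:03 10.0.3.1 created /curtis/Watch_n_launch/in/
14:32:03 10.0.3.1 QUIT - 226 0
"""


def parse_line(line):
    """Return (time, client_ip, verb, ftp_status, win32_status) or None."""
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        when = datetime.strptime(fields[0], "%H:%M:%S")
    except ValueError:
        return None  # not a log entry (header, blank line, etc.)
    ftp_status = win32_status = None
    # Assumption: when present, the last two fields are the status codes.
    if len(fields) >= 5 and fields[-2].isdigit() and fields[-1].isdigit():
        ftp_status, win32_status = int(fields[-2]), int(fields[-1])
    return when, fields[1], fields[2], ftp_status, win32_status


def find_timeouts(log_text, win32_code=10060):
    """List (time, client_ip, seconds since that client's previous entry)."""
    last_seen = {}  # client IP -> time of its most recent log entry
    hits = []
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, client_ip, _verb, _ftp_status, win32_status = parsed
        if win32_status == win32_code and client_ip in last_seen:
            gap = (when - last_seen[client_ip]).total_seconds()
            if gap < 0:
                gap += 86400  # log has times only; assume we crossed midnight
            hits.append((when.strftime("%H:%M:%S"), client_ip, gap))
        last_seen[client_ip] = when
    return hits


if __name__ == "__main__":
    for when, client_ip, gap in find_timeouts(SAMPLE_LOG):
        print(f"{when} {client_ip} timed out {gap:.0f}s after previous entry")
```

Run against a full day's log, a list like this would show whether the gap really is always 23 seconds and whether the failures cluster around particular times or clients.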
Our IIS FTP timeout is set to 5 minutes. We’ve disabled Kaspersky anti-virus as well as Windows Firewall, so it doesn’t seem to be either of those. The disk is defragmented. This also happened when the server was physical; right now it's virtualized (VMware). It happens at various times (day and night), so it’s not related to our nightly Backup Exec tape backup.
Just on a whim, we tweaked the TCP/IP settings per this (but it didn’t solve the problem):
So, what’s causing this? Windows, IIS, PIX firewall?
Thanks for any thoughts, this is driving us crazy!
When the FTP fails, our MF operator has to call the programmer (sometimes 1am, 3am) to inspect and rerun the job.
BTW: I'm the programmer, so PLEASE: help me get my beauty sleep!