Solved

FTP 23 second timeouts!!

Posted on 2011-09-09
Last Modified: 2012-05-12
Background:

We have a W2k8 FTP server (IIS) in our DMZ, behind a PIX firewall, that accepts thousands of FTP transmissions daily from our mainframe (inside our network) and from our laptops (outside the network).

Typically, a connection is made, credentials successfully authenticate, a transmission begins, then ends when complete.

This is a sample of a typical successful transaction from our ftp server logs:
   14:29:49 10.0.3.1 [10019]USER curtiscc.net\mfdownload 331 0
   14:29:49 10.0.3.1 [10019]PASS - 230 0
   14:29:49 10.0.3.1 [10019]CWD \curtis\Watch_n_launch\in 250 0
   14:29:49 10.0.3.1 [10019]created /curtis/Watch_n_launch/in/PRINT_APGM999_110819102948.zip 226 0
   14:29:51 10.0.3.1 [10019]created /curtis/Watch_n_launch/in/PRINT_APGM999_110819102948.ins 226 0
In this transmission, two files were successfully delivered to the ftp server (a .zip and .ins).

Symptom:  
Intermittently (maybe once every several days), precisely 23 seconds after the connection is made and authentication succeeds, the transmission times out with a “10060” FTP error. That FTP step fails, and the thread then continues with the next step successfully, as if nothing were wrong.

This is a sample of a failed transaction from our ftp server logs:
   14:30:01 10.0.3.1 [10020]USER curtiscc.net\mfdownload 331 0
   14:30:01 10.0.3.1 [10020]PASS - 230 0
   14:30:01 10.0.3.1 [10020]CWD \curtis\Watch_n_launch\in 250 0
   14:30:24 10.0.3.1 [10020]created PRINT_APGM999_110819102956.zip 425 10060
   14:32:03 10.0.3.1 [10020]created /curtis/Watch_n_launch/in/PRINT_APGM999_110819102956.ins 226 0
   14:32:03 10.0.3.1 [10020]QUIT - 226 0

In this transmission, the “10060” timeout occurs exactly 23 seconds after the successful logon and directory change. After it fails, the same thread continues to the next step and successfully delivers the next file. It also seems to be specific to the thread or connection (not a pause of the whole FTP service), because we have seen a new thread start and complete within the 23-second inactivity period, before the original thread eventually times out with the “10060” error.
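
To measure the gap in a larger log rather than by eye, a minimal Python sketch along these lines could help. It assumes the log layout shown in the samples above (time, client IP, bracketed session ID fused to the command, then the FTP status and the Windows status as the last two fields). The function name `find_stalls` is made up for illustration, and the sketch ignores sessions that cross midnight.

```python
from datetime import datetime

# The failed transaction quoted above, reproduced as sample input.
SAMPLE_LOG = """\
14:30:01 10.0.3.1 [10020]USER curtiscc.net\\mfdownload 331 0
14:30:01 10.0.3.1 [10020]PASS - 230 0
14:30:01 10.0.3.1 [10020]CWD \\curtis\\Watch_n_launch\\in 250 0
14:30:24 10.0.3.1 [10020]created PRINT_APGM999_110819102956.zip 425 10060
14:32:03 10.0.3.1 [10020]created /curtis/Watch_n_launch/in/PRINT_APGM999_110819102956.ins 226 0
14:32:03 10.0.3.1 [10020]QUIT - 226 0
"""


def find_stalls(log_text):
    """Return (seconds since previous entry in the same session, line) for each 425 reply."""
    stalls = []
    previous = {}  # session id -> timestamp of that session's last entry
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a log entry in the expected layout
        when = datetime.strptime(fields[0], "%H:%M:%S")
        session = fields[2].split("]")[0].lstrip("[")
        ftp_status = fields[-2]  # the last field is the Windows/Winsock status
        if ftp_status == "425" and session in previous:
            gap = (when - previous[session]).total_seconds()
            stalls.append((gap, line))
        previous[session] = when
    return stalls


for gap, line in find_stalls(SAMPLE_LOG):
    print(f"{gap:.0f}s stall before: {line}")
```

Run against the failed sample, this reports a single 23-second stall in session 10020.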

Our IIS FTP timeout is set to 5 minutes. We’ve disabled Kaspersky anti-virus as well as Windows Firewall, so it seems to be neither of those. The disk is defragmented. This happened when the server was physical, and it still happens now that it's virtualized (VMware). It happens at various times (day and night), so it’s not related to our nightly Backup Exec tape backup.

Just on a whim, we tweaked the tcp/ip settings per this (but it didn’t solve the problem) :
http://kb.globalscape.com/Print10438.aspx

So, what’s causing this?  Windows, IIS, PIX firewall?

Thanks for any thoughts, this is driving us crazy!

When the FTP fails, our MF operator has to call the programmer (sometimes 1am, 3am) to inspect and rerun the job.  

BTW: I'm the programmer, so PLEASE: help me get my beauty sleep!

Mike
Question by:mike2401
 
LVL 16

Accepted Solution

by:
AlexPace earned 250 total points
10060 is not an FTP protocol-level error. It looks like the FTP protocol error was 425: failure opening the data channel.

Try to get a protocol-level trace log from both the client and the server.  If passive mode is used I'm guessing the port range is not 100% clear... maybe you've got 100 ports in the range but a firewall hole only 99 wide.  
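
The "100 ports but a hole only 99 wide" case can be checked on paper before any trace is taken. The Python sketch below simply compares two inclusive port ranges. The ranges shown are hypothetical, not values from this thread; the real ones would come from the FTP server's passive port setting and the firewall rule.

```python
def uncovered_ports(server_range, firewall_range):
    """List ports the server may hand out that the firewall rule does not allow.

    Both arguments are inclusive (low, high) tuples.
    """
    server_low, server_high = server_range
    fw_low, fw_high = firewall_range
    return [port for port in range(server_low, server_high + 1)
            if not fw_low <= port <= fw_high]


# Hypothetical example: server offers 50000-50099, firewall permits 50000-50098.
print(uncovered_ports((50000, 50099), (50000, 50098)))
```

With those example ranges, one port in a hundred is blocked, which would produce exactly the kind of rare, seemingly random data-channel failure described.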

If you are using an FTP client like Robo-FTP, you could test the return code for error condition and automatically retry failed transfers... then the problem would fix itself and nobody will call you in the middle of the night.
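
For clients that can be scripted, the retry idea might look like the following sketch, written with Python's standard `ftplib` rather than Robo-FTP. The helper names, host, and credentials are placeholders for illustration. It reconnects on every attempt, on the assumption that a fresh session is the safest way to recover from a failed data channel.

```python
import time
from ftplib import FTP, all_errors


def connect_example():
    """Open a logged-in session. Host and credentials are hypothetical."""
    ftp = FTP("ftp.example.com", timeout=60)
    ftp.login("user", "password")
    return ftp


def upload_with_retry(connect, local_path, remote_name, attempts=3, delay=5.0):
    """Upload one file, retrying on any FTP or socket error.

    `connect` is a callable returning a logged-in FTP object.
    Returns the attempt number that succeeded; re-raises the last error otherwise.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            ftp = connect()
            try:
                with open(local_path, "rb") as source:
                    ftp.storbinary("STOR " + remote_name, source)
            finally:
                ftp.close()
            return attempt
        except all_errors as error:  # includes 4xx replies such as 425
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

A call such as `upload_with_retry(connect_example, "job.zip", "job.zip")` would then absorb an occasional 425 without waking anyone up, though it hides the underlying fault rather than fixing it.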
 
LVL 29

Assisted Solution

by:Randy Downs
Randy Downs earned 250 total points
There are metabase timeouts but the defaults are higher than 23 seconds. In any case, your issue is intermittent.

Have you considered another FTP server?

http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/31a2f39c-4d59-4cba-905c-60e7af657e49.mspx?mfr=true

Setting Connection Timeouts by Using IIS Manager
You can set global connection timeouts for the WWW or FTP service, or for individual Web sites and FTP sites. You can also set global connection timeouts on SMTP and NNTP servers. For more information about setting connection timeouts, see Setting Connection Timeouts.

Setting Connection Timeouts by Editing the Metabase
IIS 6.0 provides three metabase properties, ConnectionTimeout, HeaderWaitTimeout, and MinFileBytesPerSec, which you can use to set different types of connection timeouts. In IIS 6.0, these properties replace the ServerListenTimeout metabase property, which is no longer used for the WWW service but can be used for the FTP, SMTP, and NNTP services.

Setting connection timeouts
The ConnectionTimeout metabase property specifies the amount of time (in seconds) that the server waits before disconnecting an inactive connection. IIS applies this timeout limit after the client sends the first request to the server and the client is idle. The default value is 120 seconds for the WWW and FTP services (global settings); 120 seconds for individual Web and FTP sites; and 10 minutes for the SMTP and NNTP services. (In IIS Manager, when you change the value of the ConnectionTimeout property, you change this setting.)

For security reasons, the ConnectionTimeout property cannot be disabled. Thus, if you try to set the ConnectionTimeout property to 0, the property retains its previous setting.

Setting request timeouts
The HeaderWaitTimeout metabase property specifies the amount of time (in seconds) that the server waits for the client computer to send all HTTP headers for a request (indicated by a double carriage return) before HTTP.sys resets the connection. The purpose of this property is to help impede a type of denial of service attack that attempts to exhaust connection limits and keep those connections connected. You can apply this connection timeout only at the WWW service level.

For security reasons, the HeaderWaitTimeout property cannot be disabled. Thus, if you try to set the HeaderWaitTimeout property to 0, the property retains its previous setting.

Setting response timeouts
The MinFileBytesPerSec metabase property determines the length of time that the client has to receive the server's entire response to its request. If the client computer does not receive the entire HTTP response within the interval set by the time-out value (by default, 240 bytes per second), HTTP.sys terminates the connection. You can apply this connection timeout only at the WWW service level.

Configuring the MinFileBytesPerSec metabase property prevents a client computer from sending a request for a large response (such as a file download) and then receiving the response at a maliciously slow rate that is meant to consume resources on the server and potentially interrupt service for other client computers.

The time-out period is calculated by dividing the size of the entire response (including headers) by the value of the MinFileBytesPerSec property to obtain a maximum allowable response time, in seconds. For example, a 2-KB response (2,048 bytes) is allowed 8.5 seconds to complete if MinFileBytesPerSec has the default value of 240 bytes per second.
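
The calculation described above is a single division, sketched here in Python. The function name is made up for illustration; the 240 bytes-per-second default and the "0 disables the limit" behaviour are taken from the quoted text.

```python
def max_response_seconds(response_bytes, min_file_bytes_per_sec=240):
    """Maximum time allowed for a response of the given size, in seconds.

    Returns None when the property is set to 0, which disables the limit.
    """
    if min_file_bytes_per_sec == 0:
        return None
    return response_bytes / min_file_bytes_per_sec


print(round(max_response_seconds(2048), 1))  # 2-KB response at the default rate
```

For the 2-KB example this gives roughly 8.5 seconds, matching the figure quoted. Note that the quoted text applies this property to the WWW service only, so it would not explain an FTP timeout.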

To accommodate very slow applications, you can disable the MinFileBytesPerSec property by setting the value to 0.


Reference to Default Time-out Settings
Additional IIS 6.0 metabase properties set time-out values for ASP, Common Gateway Interface (CGI) scripts, and Internet database connection pooling. Table 6.11 gives a summary of the metabase properties for setting timeouts and the default time-out limit for each property. For information about configuration options, see Code Examples to Configure Metabase Properties. The final column of the table indicates which properties can alternatively be updated in IIS Manager.

Table 6.11 Default Time-out Values for IIS 6.0

Metabase Property        Default Time-Out Value          Configured in IIS Manager
AspQueueTimeout          Unlimited                       -
AspScriptTimeout         90 seconds                      Yes
AspSessionTimeout        20 minutes                      Yes
CGITimeout               300 seconds                     Yes
ConnectionTimeout        120 seconds (Web and FTP);      Yes
                         10 minutes (SMTP and NNTP)
HeaderWaitTimeout        None (turned off by default)    -
MinFileBytesPerSec (1)   240 bytes per second            -
PoolIdcTimeout           None (turned off by default)    -

(1) This metabase property cannot be modified in IIS Manager, but it can be modified by adding the MinFileBytesPerSec entry to the Windows registry.

For more information about ASP–related properties and counters, see Monitoring ASP Performance. For information about the registry path for the MinFileBytesPerSec entry, see Global Registry Entries.

Another way to limit connections to your Web server is to use bandwidth throttling. For information, see Throttling Bandwidth to Manage Service Availability. A related way to manage resources is to limit the number of simultaneous connections to your sites and server. For information about limiting connections, see Limiting Connections to Manage Resources.

 
LVL 29

Expert Comment

by:Randy Downs
You might try switching from Active to Passive or vice versa.

http://learn.iis.net/page.aspx/309/configuring-ftp-firewall-settings/

It is often challenging to create firewall rules for FTP server to work correctly, and the root cause for this challenge lies in the FTP protocol architecture. Each FTP client requires two connections to be maintained between client and server:

•FTP commands are transferred over a primary connection called the Control Channel, which is typically the well-known FTP port 21.
•FTP data transfers, such as directory listings or file upload/download, require a secondary connection called Data Channel.
Opening port 21 in a firewall is an easy task, but this means that an FTP client will only be able to send commands, not transfer data. This means that the client will be able to use the Control Channel to successfully authenticate and create or delete directories, but the client will not be able to see directory listings or be able to upload/download files. This is because data connections for FTP server are not allowed to pass through the firewall until the Data Channel has been allowed through the firewall.

Note: This may appear confusing to an FTP client, because the client will seem to be able to successfully log in to the server, but the connection may appear to timeout or stop responding when attempting to retrieve a directory listing from the server.

The challenges of working with FTP and firewalls don't end with the requirement of a secondary data connection; to complicate things even more, there are actually two different ways to establish a data connection:

•Active Data Connections: In an active data connection, an FTP client sets up a port for data channel listening and the server initiates a connection to the port; this is typically from the server's port 20. Active data connections used to be the default way of connecting to FTP server; however, active data connections are no longer recommended because they do not work well in Internet scenarios.

•Passive Data Connections: In a passive data connection, an FTP server sets up a port for data channel listening and the client initiates a connection to the port. Passive connections work much better in Internet scenarios and are recommended by RFC 1579 (Firewall-Friendly FTP).
Note: Some FTP clients require explicit action to enable passive connections, and some clients don't even support passive connections. (One such example is command-line Ftp.exe utility that ships with Windows.) To add to the confusion, some clients attempt to intelligently alternate between the two modes when network errors happen, but unfortunately this does not always work.

Some firewalls try to remedy problems with data connections with built-in filters that scan FTP traffic and dynamically allow data connections through the firewall. These firewall filters are able to detect what ports are going to be used for data transfers and temporarily open them on the firewall so that clients can open data connections. (Some firewalls may enable filtering FTP traffic by default, but it is not always the case.) This type of filtering is known as a type of Stateful Packet Inspection (SPI) or Stateful Inspection, meaning that the firewall is capable of intelligently determining the type of traffic and dynamically choosing how to respond. Many firewalls now employ these features, including the built-in Windows Firewall.
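
To make the inspection step concrete: in passive mode the server announces its data port in a 227 reply of the form `(h1,h2,h3,h4,p1,p2)`, where the port is `p1 * 256 + p2`. An inspecting firewall has to read that reply to know which port to open. The Python sketch below decodes such a reply. The function name and the address in the example are illustrative only, not taken from this thread.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a 227 'Entering Passive Mode' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a PASV reply: " + reply)
    numbers = [int(n) for n in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, then low byte
    return host, port


# Illustrative reply; the address is made up.
print(parse_pasv_reply("227 Entering Passive Mode (10,0,3,50,195,80)"))
```

If the firewall misreads or misses this reply, the control channel keeps working while the data connection silently times out, which is the pattern a 425 reply points to.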
 

Author Comment

by:mike2401
Thanks everyone.

Our LAN admin is reviewing these suggestions.

Since making the tcp/ip tweaks, oddly, we've seen an increase in failures on the DIR statement which precedes the actual transfer (not sure if it's a coincidence or not).

Since we did get another 10060 error this morning, I'm ready to conclude the tcp/ip tweaks are not the solution.

BTW, the FTP transmissions from our Mainframe are scripted and copied/pasted all over the place, so it's not easy to change all the mainframe jobs.

Our laptops are using a custom VB control in a custom app, so it's not something we can change.

I've never had a problem using filezilla, but then again, this happens 0.5% of the time so it's hard to replicate.
 
These FTP failures are SOOO frustrating.

Mike
 

Author Comment

by:mike2401
Our LAN admin found this yesterday (pasted below):

"Yesterday, I found some interesting articles about a component called “Receive-Side Scaling” (RSS) that is enabled by default on Win2008 servers (“netsh int tcp show global”).  Briefly, RSS balances receive traffic across multiple CPUs by offloading the data from the NIC to the CPUs.  All modern NICs support RSS, and it is enabled by default. This is fine for physical servers.  BUT, when your Win2008 server is virtual, with a virtual NIC (VMware) that does not support RSS, it is recommended to disable RSS to avoid problems with the OS attempting to perform a function that is unsupported.  This may lead to intermittent failures, slowdowns, dropped connections, etc. To disable RSS, the command is “netsh int tcp set global rss=disabled”.

So I disabled RSS and we have not had an incident in 17 hours.  I’m keeping my fingers crossed.   "
 
LVL 16

Expert Comment

by:AlexPace
Interesting.  Please post an update if it goes a week without trouble.
 

Author Comment

by:mike2401
Sadly, we went 3 days without any error, but had two ftp 23 second timeouts last night.

Interestingly, since we've done the TCP tweaking, it appears we're getting timeout failures on the DIR (list) commands which are included in the job.

I don't remember ever getting errors on those (normally just on the put statements).

Mike

PS:  I opened up a paid incident with microsoft this morning.

Author Comment

by:mike2401
*******************SOLVED**************************

(i hope)

It turns out there is a confirmed bug in the Cisco PIX OS version we are running.

Upgrading to v6.3.5 is supposed to fix the FTP bug.

We will upgrade on Monday.

I'll let everyone know.

BTW, what is proper Experts Exchange etiquette?

I want to acknowledge everyone who replied, but if the solution is the upgrade to pix, what am I supposed to do about points?
 
 
LVL 29

Expert Comment

by:Randy Downs
If one or more people helped you find the problem then you should give them the points or share them.

If not the experts will understand.
 

Author Comment

by:mike2401
So, if the solution was completely unrelated to the kind contributions of the experts, how do I close the call without awarding points to anyone?
 
LVL 29

Expert Comment

by:Randy Downs
I think you can award no points or award them to your own solution.
 

Author Comment

by:mike2401
We upgraded the PIX software last night.  Keeping our fingers crossed.  No failures last night (but that doesn't mean anything).

Mike
 
LVL 16

Expert Comment

by:AlexPace
I still think it was a failure opening the data channel.  The default Windows timeout waiting for a SYN-ACK is 21 seconds. If you give it a second to get ready to open the data channel and a second to time out and start writing to the log, then you're at 23.  That said, if your firewall is supposed to open the data channel port on the fly but doesn't, this could be a misconfiguration, or it could be one of the things that might be fixed by updating the firewall.
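
One way to arrive at the 21-second figure, sketched in Python: it assumes the commonly documented Windows defaults of a 3-second initial retransmission timeout that doubles on each retry, with two SYN retransmissions. Those defaults are an assumption here, not something stated in this thread, and they can differ by Windows version or registry settings.

```python
def connect_timeout_seconds(initial_rto=3.0, retransmissions=2):
    """Total wait before a TCP connect attempt gives up, with doubling backoff.

    Returns (total seconds, list of individual waits).
    """
    waits = [initial_rto * 2 ** i for i in range(retransmissions + 1)]
    return sum(waits), waits


total, waits = connect_timeout_seconds()
print(total, waits)  # assumed defaults: waits of 3, 6 and 12 seconds
```

Under those assumptions the waits add up to 21 seconds, which leaves about two seconds of overhead to reach the 23 seconds seen in the logs.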
 

Author Comment

by:mike2401
It looks like it was indeed the Cisco PIX.  The upgrade to v6.3.5 solved it.

Rajesh and Richard from Microsoft were great, really did a great job of helping us rule out our IIS, and of following up.

So as not to offend any of the experts who took the time to offer suggestions, I'm going to split points.

Thanks to everyone for their suggestions.

Mike
 

Author Closing Comment

by:mike2401
Even though the problem turned out to be our pix firewall, I very much want to thank everyone for their help.

Regards,
Mike