Solved

Batch FTP Hangs - Windows

Posted on 2011-03-10
Last Modified: 2013-11-29
I have searched for FTP issues and mostly found reports of FTP hanging while closing connections, or issues with users sitting in front of DOS windows.  Ours are unattended batch jobs.

We cannot use MPUT as we have too many directories and new directories and files are added all the time.

We use a program (originally VB6, now VB.NET 2008) to dynamically search for all files matching a file pattern under a starting point on the source drive, and we build all the FTP commands to navigate through both source and destination servers.  The program has behaved very well for years.

After the script is built, we shell out to DOS and run FTP, passing in the script and redirecting output to a file.  Once FTP ends, we read the file looking for errors and take action as needed.  We also allow some retries, but usually we fail the job and a notification is sent out.
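
For readers who want a guard against the hang itself, here is a minimal sketch of the shell-out step with a timeout added, so that a stuck child process is killed instead of blocking the job forever. It is written in Python rather than VB.NET, and the function names, the log file name, and the 30-minute limit in the comment are all hypothetical, not the poster's actual code. The only protocol fact it relies on is that FTP replies beginning with 4xx or 5xx indicate errors.

```python
import re
import subprocess

# FTP replies starting with 4xx (transient) or 5xx (permanent) signal failure.
ERROR_REPLY = re.compile(r"^[45]\d\d[ -]")

def run_with_timeout(cmd, log_path, timeout_s):
    """Run cmd with its output redirected to log_path.

    Returns "ok", "failed" (non-zero exit code) or "hung" (killed after
    timeout_s seconds without finishing).
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                                stdout=log, stderr=subprocess.STDOUT)
        try:
            rc = proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            proc.kill()   # the child is stuck: terminate it
            proc.wait()   # reap it so nothing is left behind
            return "hung"
    return "ok" if rc == 0 else "failed"

def find_ftp_errors(log_path):
    """Return the log lines that look like FTP error replies."""
    with open(log_path) as log:
        return [line.rstrip() for line in log if ERROR_REPLY.match(line)]

# Hypothetical usage (script name and time limit are placeholders):
#   status = run_with_timeout(["ftp", "-n", "-s:upload.ftp"], "ftp_out.log", 1800)
#   errors = find_ftp_errors("ftp_out.log")
```

The exit code of the command-line client may not reflect transfer failures, so the output scan stays the primary error check; the timeout only turns a silent hang into a detectable "hung" status that the retry logic can act on.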

Symptoms:
Intermittent - say 5 times in 3 months.  We run this process maybe 30 times a day, transferring anywhere from 1 to 3000 files per FTP instance.  At 22 business days a month, that is 5 issues out of 1980 runs (about 0.25%).
So far, the issue always occurs well after connection, after 10 to 2500 files have already been transferred; then it just hangs.  FTP output shows no errors, just hung on the next file.
Task Manager shows the VB program present but using NO CPU cycles, and shows FTP running with NO CPU cycles.  At first I noticed that memory grows as FTP runs, so I thought it maxed out and hung, but it hangs when running 50 files as well.
We do run multiple jobs concurrently, usually not more than 2.  They use the same credentials and the same source and target servers, but not the same source or target directories.
We do NOT see FTP hangs in the development environment or in the secondary Production environment; it hangs only in the Primary Production environment - or so it appears.
Dev & secondary Production do NOT use the FTP "gateway"; they connect directly to the FTP server.  Only Primary Production uses the FTP "gateway".

Factors that changed in the last 6 months, during which this has been occurring:
We moved from older W2K servers to W2K3 servers
New servers are in newer locations, and target servers are not as close - 7 hops...  maybe more for some environments.  I assume a different version of FTP on the new boxes.  All boxes, old and new, are 32-bit.
Recompiled on VS2008 from VS2005
Entire corporation network "seems" (anecdotal) slower.
We switched from an older FTP "gateway" that was over capacity to a newer FTP "gateway".  From our end, the name of the gateway changed, the hardware is beefier and redundant, but the process remains the same.  Target directories are mapped as virtual directories to the gateway.  This did not change - but the gateway is new, and the target servers are new.  Same virtual gateway names.

We are installing clumsy workarounds, but we would like to get at root cause.

Are there little-known constraints on DOS and/or FTP?  Something beyond the well-known insider view that Microsoft FTP is notoriously unreliable and not secure.

I do not believe my homegrown code is to blame - as we just shell out and run native FTP in a DOS window.

Question by:dsmrtn
9 Comments
 
LVL 31

Expert Comment

by:DrUltima
ID: 35100308
For my own edification, do you have to do this in a command shell, or can you do it programmatically with VBScript or PowerShell?

DrUltima
 
LVL 5

Expert Comment

by:xylog
ID: 35143835
Maybe your gateway has an FTP flood threshold beyond which it limits or denies connections. Check the logs on the FTP gateway.
 
LVL 16

Expert Comment

by:AlexPace
ID: 35145958
Unfortunately by relying on the DOS ftp client to do the core work you inherited one of its greatest flaws: no protocol-level logging.  

You could probably implement the entire process as a Robo-FTP command script that includes dynamic file/subfolder searching based on wildcard pattern, error trapping, automatic retries and email notifications on success/failure/status whatever.

Learning a new "scripting language" to solve a problem that happens less than 1% of the time might not be worth it, but I can say that the ability to easily log both the program logic and the protocol trace does WONDERS for your ability to troubleshoot these types of problems.  I open the Script Log in one window, the Trace Log in another, find the line where the problem happened, and align the time stamps in both windows so I can see exactly what happened "on the wire" when the problem began. This usually makes the problem quite obvious.
 
LVL 57

Expert Comment

by:giltjr
ID: 35176775
I would suggest running a packet capture to see if it is the client or the gateway that is having the problem.

I use wireshark (http://www.wireshark.org).

It could be the timing, that is, doing "x" number of transfers in "y" number of seconds.  Too many in too short a time period could cause you to exceed the number of open TCP connections allowed to the gateway.

Author Comment

by:dsmrtn
ID: 35181107
I tried responding earlier but my satellite ISP was down - yes, it's true.  Typed it all out, then did a submit and...nothing.

Years ago I looked into VB FTP classes and didn't like what I found in the public domain and I wasn't able to test 3rd party tools.  Now, it would be difficult to justify effort to convert to a VB solution given the low frequency of errors.

We are very limited on FTP choices.  We have a 3rd party choice for a more secure FTP if the gateway is not used.  With native FTP, we go through the gateway.  The Robo-FTP option seems very nice.

"giltjr" - are you saying that each file transferred over a single FTP connection opens its own TCP connection?  To test that, I could create 50 or so 1-10 byte files and create a set of FTP statements that performs an FTP PUT of these 50 files 10 times, and it should hang?  And it would be repeatable?
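
The test described above could be generated along these lines. This is only a sketch in Python: the function name, file names, host, credentials, and remote directory are hypothetical placeholders, and it assumes the script is fed to the command-line client with auto-login suppressed, so the `user` line does the login.

```python
import os

def build_repro(workdir, host, user, password, remote_dir,
                n_files=50, repeats=10):
    """Create n_files tiny files in workdir and return FTP script text
    that PUTs every file `repeats` times over one control connection."""
    os.makedirs(workdir, exist_ok=True)
    names = []
    for i in range(n_files):
        name = f"tiny_{i:03d}.dat"
        with open(os.path.join(workdir, name), "wb") as f:
            f.write(b"x" * (1 + i % 10))   # sizes cycle through 1..10 bytes
        names.append(name)

    # One login, then every PUT in sequence on the same command connection.
    lines = [f"open {host}", f"user {user} {password}", "binary",
             f"lcd {workdir}", f"cd {remote_dir}"]
    for _ in range(repeats):
        lines.extend(f"put {name}" for name in names)
    lines.append("bye")
    return "\n".join(lines) + "\n"
```

With the defaults this yields 500 PUT commands. Paths containing spaces would need quoting in the script, which the sketch does not handle.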

We have the 3 environments - DEV (no gateway, no hangs), Prod Secondary (no gateway, no hangs), and Production Primary (gateway, hangs).  If it was a TCP connection issue, wouldn't we see FTP HANGS in all environments?

 
LVL 57

Accepted Solution

by:
giltjr earned 250 total points
ID: 35181286
Yes, some FTP clients and servers will not re-use an existing TCP data connection when multiple transfers are done over the same command connection (port 21).

You would not necessarily see hangs in all environments.  It is not TCP itself that limits the number of connections, but the server used.  In your case you only have problems when using the gateway.  So it is possible that the gateway is limiting the number of TCP ftp-data connections a single client can have.

Since you only have the issue when going through the gateway  I would focus on the gateway.  A packet trace would help identify if the client is having the problem, or the server.  Do you have access to the gateway and its logs?  

It is possible that the gateway does not support re-using TCP ftp-data connections and thus if you do 50 transfers, it requires 50 unique TCP data connections.  If the gateway limits a single client to having, say, 49 connections, then if you try to do the 50th transfer before one of the other 49 connections has been terminated, you could hang.
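
To make the timing dependence concrete, here is a toy model in Python. Everything in it is an assumption for illustration: the gateway's real behavior is unknown, and the function name, the per-client limit, and the idea that each data connection stays counted for a fixed `hold_s` seconds after its transfer starts are all hypothetical.

```python
from collections import deque

def first_blocked_transfer(start_times, hold_s, limit):
    """Return the 0-based index of the first transfer that finds `limit`
    data connections still counted as open, or None if none would block.

    start_times must be sorted ascending (seconds). Each transfer is
    assumed to hold one connection for hold_s seconds after it starts.
    """
    open_until = deque()              # expiry times of counted connections
    for i, t in enumerate(start_times):
        while open_until and open_until[0] <= t:
            open_until.popleft()      # this connection has been released
        if len(open_until) >= limit:
            return i                  # no slot free: this transfer stalls
        open_until.append(t + hold_s)
    return None

# 50 transfers, one every 0.1 s, against a hypothetical limit of 49:
times = [i * 0.1 for i in range(50)]
print(first_blocked_transfer(times, hold_s=60.0, limit=49))  # 49 (the 50th transfer)
print(first_blocked_transfer(times, hold_s=1.0, limit=49))   # None
```

Under this model the same 50-file run either stalls or completes depending only on how long connections linger, which would be consistent with hangs that are intermittent and not tied to a particular file count.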

 

Author Comment

by:dsmrtn
ID: 35184969
I will make inquiries about gateway configuration and possible limitations and respond back here.  It might take a few days or more.  I have no access to this information and, as might be expected in a large, one-size-fits-all corporate environment, there are processes in place to ask questions, wait for questions to get assigned, and wait for responses to questions.
 

Author Closing Comment

by:dsmrtn
ID: 35346892
I feel bad about grading this like this.  The answer might be spot on, but I have no idea.  I haven't fixed my issue, and I do not know if this suggestion is right or wrong.  I just know I'm not going to get anything from support groups before I'm hounded for a no-activity question.

This at least offers insight to where the problem might be.