Batch FTP Hangs - Windows

I have searched FTP issues and mostly see reports of FTP hanging while closing connections, or issues involving users sitting in front of DOS windows.  Ours are unattended batch jobs.

We cannot use MPUT as we have too many directories and new directories and files are added all the time.

We use a program, originally VB6 and now VB.Net 2008, that dynamically searches for all files matching a file pattern under a starting point on the source drive and builds all the FTP commands to navigate through both source and destination servers.  The program has behaved very well for years.

After the script is built, we shell out to DOS and run FTP, passing in the script and redirecting output to a file.  Once FTP ends, we read the file looking for errors and take action as needed.  We also allow some retries, but usually we fail the job and a notification is sent out.

The problem is intermittent: say 5 times in 3 months, and we run this process maybe 30 times a day, for anywhere from 1 to 3000 files per FTP instance.  At 22 business days in a month, that is 5 issues out of 1980 runs.
So far, the issue always occurs well after connection, after 10 to 2500 files have already transferred; then it just hangs.  The FTP output shows no errors, just hung on the next file.
Task Manager shows the VB program there but using NO CPU cycles, and shows FTP running with NO CPU cycles.  At first I noticed that as FTP runs, its memory gets bigger and bigger, so I thought it maxed out and hung, but it hangs when running 50 files as well.
We do run multiple jobs concurrently, usually not more than 2; they use the same credentials and the same source and target servers, but not the same source or target directories.
We do NOT see FTP hangs in the development environment or in the Secondary Production environment; it hangs only in the Primary Production environment - or so it appears.
Dev & Secondary Production do NOT use the FTP "gateway"; they connect directly to the FTP server.  Only Primary Production uses the FTP "gateway".
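
For reference, the run count and failure rate quoted above work out as follows (a quick check in Python; the variable names are just for illustration):

```python
runs_per_day = 30
business_days_per_month = 22
months = 3
hangs = 5

total_runs = runs_per_day * business_days_per_month * months
hang_rate = hangs / total_runs

print(total_runs)          # 1980
print(f"{hang_rate:.2%}")  # 0.25%
```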

Factors that changed in the last 6 months, during which this has been occurring:
We moved from older W2K servers to W2K3 servers.
New servers are in newer locations, and target servers are not as close - 7 hops...  maybe more for some environments.  I assume a different version of FTP on the new boxes.  All boxes, old and new, are 32-bit.
Recompiled on VS2008 from VS2005.
The entire corporate network "seems" (anecdotal) slower.
We switched from an older FTP "gateway" that was over capacity to a newer FTP "gateway".  From our end, the name of the gateway changed, the hardware is beefier and redundant, but the process remains the same.  Target directories are mapped as virtual directories to the gateway.  This did not change - but the gateway is new, and the target servers are new.  Same virtual gateway names.

We are installing clumsy workarounds, but we would like to get at root cause.

Are there little known constraints on DOS and/or FTP?  Something beyond a well known insider issue that Microsoft FTP is notoriously unreliable and not secure.

I do not believe my homegrown code is to blame - as we just shell out and run native FTP in a DOS window.

Justin Owens (ITIL Problem Manager) Commented:
For my own edification, do you have to do this in a command shell, or can you do it programmatically with VBScript or PowerShell?

Maybe your gateway has an FTP flood threshold beyond which it limits or denies connections. Check the logs on the FTP gateway.
Unfortunately, by relying on the DOS ftp client to do the core work, you inherited one of its greatest flaws: no protocol-level logging.

You could probably implement the entire process as a Robo-FTP command script that includes dynamic file/subfolder searching based on wildcard pattern, error trapping, automatic retries and email notifications on success/failure/status whatever.

Learning a new "scripting language" to solve a problem that happens less than 1% of the time might not be worth it, but I can say that the ability to easily log both the program logic and the protocol trace does WONDERS for your ability to troubleshoot these types of problems.  I open the Script Log in one window, the Trace Log in another, find the line where the problem happened, and align the time stamps in both windows so I can see exactly what happened "on the wire" when the problem began. This usually makes the problem quite obvious.

I would suggest running a packet capture to see if it is the client or the gateway that is having the problem.

I use Wireshark.

It could be the timing, that is, doing "x" number of transfers in "y" number of seconds.  Too many in too short a time period could cause you to exceed the number of open TCP connections allowed to the gateway.
dsmrtn (Author) Commented:
I tried responding earlier but my satellite ISP was down - yes is true.  Typed it all out then did a submit and...nothing.

Years ago I looked into VB FTP classes and didn't like what I found in the public domain and I wasn't able to test 3rd party tools.  Now, it would be difficult to justify effort to convert to a VB solution given the low frequency of errors.

We are very limited on FTP choices.  We have a 3rd party choice for a more secure FTP if the gateway is not used.  If we use native FTP, then we use the gateway.  The Robo-FTP option seems very nice.

"giltrj" - are you saying that each file transferred over a single FTP connection opens its own TCP connection?  To test that, I could create 50 or so 1-10 byte files and create a set of FTP statements that performs FTP PUT of these 50 files 10 times and it should hang?  And it would be repeatable?

We have the 3 environments - DEV (no gateway, no hangs), Prod Secondary (no gateway, no hangs), and Production Primary (gateway, hangs).  If it was a TCP connection issue, wouldn't we see FTP HANGS in all environments?

Yes, some FTP clients and servers will not re-use an existing TCP connection when multiple transfers are done over the same command connection (port 21).

You would not necessarily see hangs in all environments.  It is not TCP itself that limits the number of connections, but the server used.  In your case you only have problems when using the gateway.  So it is possible that the gateway is limiting the number of TCP ftp-data connections a single client can have.

Since you only have the issue when going through the gateway  I would focus on the gateway.  A packet trace would help identify if the client is having the problem, or the server.  Do you have access to the gateway and its logs?  

It is possible that the gateway does not support re-using TCP ftp-data connections, and thus if you do 50 transfers, it requires 50 unique TCP data connections.  If the gateway limits a single client to, say, 49 connections, and you try to do the 50th transfer before one of the other 49 connections has been terminated, then you could hang.
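
The scenario above can be modeled with a toy simulation. This is purely illustrative: it assumes one new data connection per transfer, a fixed time per transfer, and a fixed "linger" time before the gateway counts a finished connection as closed. None of these numbers come from the actual gateway.

```python
from collections import deque


def first_blocked_transfer(n_transfers, transfer_secs, linger_secs, cap):
    """Return the 1-based index of the first transfer that finds `cap`
    data connections still counted as open, or None if none is blocked."""
    release_times = deque()  # when each earlier data connection is released
    now = 0.0
    for index in range(1, n_transfers + 1):
        # Drop connections whose linger period has already ended.
        while release_times and release_times[0] <= now:
            release_times.popleft()
        if len(release_times) >= cap:
            return index
        now += transfer_secs
        release_times.append(now + linger_secs)
    return None


# Fast transfers, slow cleanup: the 50th transfer hits a cap of 49.
print(first_blocked_transfer(50, 0.1, 60, 49))   # 50
# Same cap, quick cleanup: never blocked.
print(first_blocked_transfer(3000, 0.1, 1, 49))  # None
```

In this model, whether a run blocks depends on how fast transfers complete relative to how fast old connections are released, which would be consistent with a hang that shows up only intermittently and only through the gateway.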

dsmrtn (Author) Commented:
I will make inquiries about gateway configuration and possible limitations and respond back here.  Might take a few days or more. I have no access to this information and as might be expected in a corporate, large one size fits all environment, there are processes in place to ask questions, wait for questions to get assigned, wait for responses to questions.
dsmrtn (Author) Commented:
I feel bad about grading this like this.  The answer might be spot on, but I have no idea.  I haven't fixed my issue, and I do not know if this suggestion is right or wrong.  I just know I'm not going to get anything from support groups before I'm hounded for a no-activity question.

This at least offers insight to where the problem might be.