JDCam

asked:

FTP from Linux. Too many files?

Experts,
I have an application on a Linux server that writes its output files into a single directory. With recent growth in business, that directory holds about 147,000 files at any time. These are mostly temp print files that are continually purged.

I have a handful of users who use FTP clients (WS_FTP and FileZilla) to copy specific files from the Linux server to their PCs. This is not working anymore; the FTP clients are only displaying a fraction of the total files.

I am not sure what is going on. Is there a limitation on either the FTP client or the Linux server? Is some type of network or firewall issue preventing a listing of all the files? Is the issue at the desktop level?

As a temporary solution, I am forced to log in to the Linux server and copy the requested file into a new (nearly empty) directory. The user can then grab the file from there with no problem.
masnrock

Check the configuration of the server. Which Linux distro and version are you using?
JDCam

ASKER

Linux version 4.1.12-61.1.28.el6uek.x86_64 (mockbuild@x86-ol6-builder-06) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #2 SMP Thu Feb 23 20:03:53 PST 2017

LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: OracleServer
Description:    Oracle Linux Server release 6.9
Release:        6.9
Codename:       n/a
I also forgot to ask which FTP server software you're running on that machine.
If I understand what you're saying, trying to display hundreds of thousands of files in an FTP directory listing will likely fail in all sorts of ways.

FTP just wasn't meant to handle this number of files efficiently.

Simple solution: no FTP user will be downloading the print files, so move these to a different directory.

Another tip: when purging these files, find . {args} -delete is far more efficient than any other method. I once had a client whose... less than intelligent developers were dumping small log files into one directory with no pruning. That resulted in nearly 5,000,000 files that had to be removed. The find -delete trick was the only method that worked without taking the machine down.
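
For example, a purge along these lines (the path and the seven-day threshold are placeholders; match them to your own retention rules). Because find unlinks each file as it walks the directory, it never builds the enormous argument list that makes rm or xargs choke at this scale:

    # delete temp print files older than 7 days, one unlink at a time
    find /path/to/printdir -type f -mtime +7 -delete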

The workaround you mention is similar to what I suggest, but it is better to move the temp print files elsewhere up front, because your solution requires walking the entire directory to match and move files. That can take a huge amount of machine resources and be very slow for large numbers of files.
ASKER CERTIFIED SOLUTION
E C
JDCam

ASKER

Let me clarify... Every file in the directory is a temp print file. Depending on the type, some are purged at 24 hrs, some at 7 days, some at 60 days, and others at 1 yr. The number of files on hand is directly tied to how busy we are (number of print jobs).

Users might access at most 10-12 files per week. We don't know which file will be needed until a user asks for it, so trying to redirect files in advance would not work.

The FTP client method has been in place for 10 to 12 years with no issues until a few weeks ago. It seems we hit a tipping point.
If there's no way to control the number of files (like you said, it's tied to how busy you are), then what if your FTP home directory had 12 folders, one for each month of the year... 01, 02, 03 ... 12?

Then as each file gets uploaded to FTP, have a script that automatically moves the incoming file into the appropriate folder.

So at any given moment your home folder has nothing in it except 12 folders.
Basically you're dividing your files into folders so you don't overload the FTP server and the Linux OS.

I'm sure a Linux guru could help with the script.
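
A minimal sketch of such a script, assuming the FTP home is /home/ftp/print and that files should be grouped by the month they were written (the path and the grouping rule are placeholders):

    #!/bin/sh
    # move each file in the FTP home into a per-month subfolder (01..12),
    # chosen from the file's modification time
    SPOOL=/home/ftp/print
    for f in "$SPOOL"/*; do
        [ -f "$f" ] || continue        # skip the month subfolders themselves
        m=$(date -r "$f" +%m)          # month the file was last written
        mkdir -p "$SPOOL/$m"
        mv "$f" "$SPOOL/$m/"
    done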
Or look to the FTP server software to do it. For example, check out crushftp.com. It has a lot of automation options.
SOLUTION
JDCam

ASKER

Thanks,
I am unable to modify the directories without involving the software company, and I will likely recommend such changes for a future release. In the interim, I am stuck with what I have.

I agree that the issue is the client-side FTP program being unable to handle so many files.
Playing around with FileZilla, I found that although a file may not be listed, the search function finds it with no problem. I will simply show the users how to use search instead of scrolling through the files (which is more efficient anyway).
Congratulations! The workaround solution is cool.

Anyway, keeping a directory structure like that needn't involve the software company.
You just need a periodically run script that scans the files and moves them into subdirectories by some rule.
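
For instance, a cron entry along these lines would run such a sorting script every 10 minutes (the script path and the schedule are placeholders):

    # /etc/cron.d/sort-print-files: run the sorting script every 10 minutes
    */10 * * * * root /usr/local/bin/sort_print_files.sh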