Two Part Question:
I have an Ubuntu FTP server and I want to generate a test report of files (with full paths) that are 90 days old or older. The second part would then be to delete any files older than 90 days, with an exception list of directories to be ignored.
I have set up a mirror with rsync so I can test these scripts outside of production, and after some Google searches I'm not sure which way to go. Some suggest using -ctime or -mtime, and then there is the debate over using -exec versus xargs to perform the deletion.
The FTP site has thousands of files, so I'm looking for the most efficient method to first generate the report and then remove files and directories older than 90 days, with some directories excluded. The exclusions should be read from a text file so they can be changed as needed for specific projects.
Any suggestions or further reading specific to what I am trying to accomplish would be greatly appreciated. I'm comfortable with the command line but inexperienced with bash scripting, and this is an area I really need to brush up on for automating tasks like these.
ASKER
This is harder than it looks...
Using -mtime is better; -ctime may not work. If a backup program reads the files and then resets atime, that reset is itself a metadata change and causes ctime to advance, so ctime no longer tells you when the content last changed. On the other hand, if your backup program does reset atime in this manner, then a recent atime means someone other than the backup program has been reading the file, and perhaps it should not be removed. So, assuming a reasonable backup policy, atime may be the best choice.
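For reference, this is how the three timestamps and the matching find tests line up (GNU stat and find, as shipped with Ubuntu; somefile is just a placeholder):

    stat somefile        # the Access/Modify/Change lines are atime, mtime and ctime

    find . -mtime +90    # content last modified more than 90 days ago
    find . -ctime +90    # inode/metadata last changed more than 90 days ago
    find . -atime +90    # last read more than 90 days ago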
Now consider a directory structure like:
./a/b/c/datafile
If datafile was created recently inside an already existing a/b/c, then directories a and a/b are stable while a/b/c has just been updated. Doing a "rm -rf ./a" just because ./a has not changed will forcibly remove ./a/b/c/datafile which, in this case, is a recent file. Using rmdir for directories solves that, but we must be prepared for the fact that ./a will be left alone even though it passes the -mtime test.
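A quick illustration of that difference, using a throwaway tree in a scratch directory:

    mkdir -p ./a/b/c
    touch ./a/b/c/datafile    # recent file under otherwise untouched ./a and ./a/b
    rmdir ./a                 # fails with "Directory not empty", so datafile is safe
    # rm -rf ./a              # would silently take the recent datafile with it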
On the other hand, if ./a/b/c/datafile is itself old, removing datafile makes ./a/b/c look recently changed. Building the complete list of removal candidates before removing anything solves that.
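You can watch the parent directory's mtime move when a file is removed (same scratch tree as above, GNU stat):

    stat --format='%y' ./a/b/c    # mtime before the removal
    rm ./a/b/c/datafile
    stat --format='%y' ./a/b/c    # mtime is now the time of the removal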
We need to process ./a/b before we process ./a, and this implies that we need -depth on the find statement.
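On the example tree, -depth reverses the order so that a directory's contents are printed before the directory itself:

    find ./a          # ./a  ./a/b  ./a/b/c  ./a/b/c/datafile
    find ./a -depth   # ./a/b/c/datafile  ./a/b/c  ./a/b  ./a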
We cannot divide the world into ordinary files and directories unless we really want the script to fail if we encounter a socket, fifo, special file, etc. Instead we need to think in terms of directories and non-directories.
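In find terms that means testing ! -type d rather than -type f:

    find . ! -type d -mtime +90   # everything that is not a directory: regular files, sockets, fifos, ...
    find . -type d -mtime +90     # directories only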
So maybe something like this will get you closer (but I have not tested it):
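What follows is a minimal, untested sketch of the approach described above, assuming GNU find and bash on Ubuntu. FTP_ROOT, EXCLUDE_FILE and REPORT are placeholder names, and the exclusion file is expected to hold one full directory path per line (anything under a listed directory is skipped):

    #!/bin/bash
    # Sketch only -- run it against the rsync mirror before production.

    FTP_ROOT="/srv/ftp"                        # adjust to your FTP root
    EXCLUDE_FILE="/etc/ftp-purge/exclude.txt"  # one directory path per line
    REPORT="/tmp/old-files-report.txt"
    AGE_DAYS=90

    # Phase 1: the report. -depth prints a directory's contents before the
    # directory itself, and -mtime +90 matches anything whose content was
    # last modified more than 90 days ago. grep -v -F -f drops every path
    # that contains a line from the exclusion file (substring match, so
    # keep the entries specific).
    find "$FTP_ROOT" -mindepth 1 -depth -mtime +"$AGE_DAYS" -print \
        | grep -v -F -f "$EXCLUDE_FILE" > "$REPORT"

    # Phase 2: the removal, driven entirely by the report built above, so
    # deleting old files cannot make their parent directories look recent.
    while IFS= read -r path; do
        if [ -d "$path" ]; then
            # rmdir refuses to remove a non-empty directory, so a directory
            # that still holds a recent file (./a in the example) is left alone.
            rmdir "$path" 2>/dev/null
        else
            rm -f -- "$path"
        fi
    done < "$REPORT"

Review the report before running the removal phase; the paths in it are absolute because find is started from "$FTP_ROOT".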