Solved

Linux Script to remove files on specific directories with exclusions

Posted on 2016-10-21
Last Modified: 2016-10-31
Hello,

I'm running an SFTP server on CentOS 7.2 over the SSH service, and I need a script that removes files older than 30 days from several specific directories and their sub-directories. In addition, I need to be able to exclude several sub-directories so that their files won't be removed.

As an example, in my home folder I have 4 folders:
/home/a01/SFTPWRITE
/home/a02/SFTPWRITE
/home/a03/SFTPWRITE
/home/a04/SFTPWRITE

I need to remove all the files and directories older than 30 days from SFTPWRITE in each of the four folders (a01, a02, a03, a04), while excluding several folders within the SFTPWRITE folders. For example, I need to exclude the /home/a01/SFTPWRITE/exc01 folder.

Any advice on how I can set this script up as a cron job?

Sorry if my question is basic, but please bear with me as I'm a Linux beginner.

Thanks
Question by:Nitsan Reznik
31 Comments
 
LVL 37

Expert Comment

by:Gerwin Jansen
ID: 41855340
You could use a find command to find all files older than 30 days first. Then I would create a list that contains the files and folders to exclude. Filter both lists to create a list of files to delete and then perform the delete.
0
 

Author Comment

by:Nitsan Reznik
ID: 41855343
Thank you for your comment. May I ask you to share an example of the command I should use? I'm quite confused by the command switches for finding and removing.
0
 
LVL 76

Expert Comment

by:arnold
ID: 41855362
ssh username@remotehost 'find /path/to/dir/of_interest -mtime +30 -ls'
This will return a list of files meeting the criterion of being older than 30 days.
If you are satisfied, replace -ls with -exec rm {} \;

That should do it. -mtime uses the last-modified time.
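
The list-first, delete-second approach above can be tried safely in a scratch directory. This is a minimal sketch, assuming GNU find and touch; the directory layout and file names are made up for illustration, and -type f is an addition that restricts the match to regular files.

```shell
# Build a scratch directory so nothing real is touched (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/a01/SFTPWRITE"
touch -d '40 days ago' "$demo/a01/SFTPWRITE/old.txt"
touch "$demo/a01/SFTPWRITE/new.txt"

# Step 1: list what would be removed (regular files modified more than 30 days ago).
find "$demo/a01/SFTPWRITE" -type f -mtime +30 -ls

# Step 2: once the list looks right, swap -ls for -exec rm.
# find needs the space between {} and \; to recognise the end of the command.
find "$demo/a01/SFTPWRITE" -type f -mtime +30 -exec rm -- {} \;

remaining=$(ls "$demo/a01/SFTPWRITE")
echo "$remaining"
```

Running the listing step on the real path first, and reading the output, is the cheap safeguard here because the delete step cannot be undone.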
0
 

Author Comment

by:Nitsan Reznik
ID: 41855366
Wow thank you so much for your comment. How can I exclude folders?
0
 
LVL 76

Expert Comment

by:arnold
ID: 41855373
One option, instead of -exec, is to pass the results to grep -v:
find /path -mtime +30 | grep -v "(exclude1|exclude2|exclude3)" | rm {}\;
The difficulty with the above is that if there are too many items, the list may exceed the number of arguments rm can accept.
An option is to use an intermediary that will run rm on one file at a time.
-name can be used as an include pattern in find:
find /path -name "pattern" -mtime +30

Is there any option to set up scripts on the remote side to manage the content through cron?
0
 
LVL 37

Expert Comment

by:Gerwin Jansen
ID: 41855376
@Arnold - you could put the folders to exclude in an exclude file and use grep -vf exclude_file - I'm not sure the | rm {}\; at the end will work because of the pipes.
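
The exclude-file idea could look roughly like the sketch below. It assumes GNU find, grep, and touch; the scratch layout and the exclude_file contents are hypothetical, and -F is added so the patterns are treated as literal strings rather than regular expressions.

```shell
# Scratch layout standing in for /home/a01/SFTPWRITE (hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/a01/SFTPWRITE/exc01" "$demo/a01/SFTPWRITE/data"
touch -d '40 days ago' "$demo/a01/SFTPWRITE/exc01/keep.txt" "$demo/a01/SFTPWRITE/data/old.txt"

# One literal pattern per line; any path containing one of them is dropped.
printf '%s\n' '/SFTPWRITE/exc01/' > "$demo/exclude_file"

# -v inverts the match, -F takes patterns literally, -f reads them from a file.
candidates=$(find "$demo" -type f -mtime +30 | grep -vFf "$demo/exclude_file")
echo "$candidates"
```

Keeping the exclusions in a file means the list can grow without editing the command itself.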
0
 
LVL 50

Accepted Solution

by:
Steve Bink earned 500 total points
ID: 41855387
First thing: read the man page for the find utility.  This will find all files matching your search terms.  One thing you might notice is missing from find's options is the ability to find files based on creation time.  AFAIK, most Linux file systems don't really expose the file *creation* time.  Find has options for last access (atime), last status change (ctime), and last change (mtime)... that's as close as you'll get it.  If you don't ever chmod these files, though, ctime should work just fine for your purposes.  To find all files under the current directory older than 30 days, for example:
find . -ctime 720

Once you have the list of files, you could just pass find's -delete switch, but you also want to exclude and include files only under specific directories.  Find's normal output is a one-line-per-file listing of full paths, so we should be able to use grep to do some filtering.  We'll use the pipe ability to send find's output through grep.  All of these files will be under an SFTPWRITE subdirectory, so we can look for that first:
find /home -ctime 720 | grep '/SFTPWRITE/'

You also want to exclude certain directories.  We can use grep again, but this time we'll use an inverted search (return only lines which *don't* match the pattern), and use a regular expression to provide some flexibility.  This example will filter out any file paths which include the exc01, exc02, or exc03 subdirectories:
find /home -ctime 720 | grep '/SFTPWRITE/' | grep -vE '/(exc01|exc02|exc03)/'

Finally, you want to delete each file in the curated list.  For that, we call on xargs:
find /home -ctime 720 | grep '/SFTPWRITE/' | grep -vE '/(exc01|exc02|exc03)/' | xargs -I{} /bin/rm {}

Two other notes:
  • If your filenames are likely to have odd characters, such as spaces, newlines, tabs, etc., add the -print0 option to find, the -z option to each grep in between, and the -0 option to xargs.
  • I highly recommend not adding the xargs command until after you have verified that you absolutely, positively, 100%-guaranteed want to delete the returned list.  This operation is not reversible.
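
A null-delimited version of the pipeline might look like the sketch below. It assumes GNU find, grep, and xargs; grep's -z option makes each grep stage read and write NUL-terminated records so the names survive between find -print0 and xargs -0. The scratch directory and file names are hypothetical, and -type f, -mtime, and -r are choices made for this example rather than part of the commands above.

```shell
# Scratch layout with awkward names: a space and an apostrophe (hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/a01/SFTPWRITE/exc01" "$demo/a01/SFTPWRITE/in box"
touch -d '40 days ago' \
  "$demo/a01/SFTPWRITE/in box/it's old.txt" \
  "$demo/a01/SFTPWRITE/exc01/keep.txt"

# Every stage passes NUL-terminated names; -r skips rm when nothing is left to delete.
find "$demo" -type f -mtime +30 -print0 \
  | grep -z '/SFTPWRITE/' \
  | grep -zvE '/(exc01|exc02|exc03)/' \
  | xargs -0 -r rm --

left=$(find "$demo" -type f)
echo "$left"
```

To preview instead of delete, the last stage can be swapped for `xargs -0 -r ls -ld --`.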
0
 

Author Comment

by:Nitsan Reznik
ID: 41855414
Thank you chaps! I will give it a whirl and let you know.
0
 

Author Comment

by:Nitsan Reznik
ID: 41858154
Hi Steve,

For some reason -ctime doesn't seem to give any output back, but -mtime +30 did the trick.
0
 
LVL 76

Expert Comment

by:arnold
ID: 41858455
What number did you use with -ctime?
ctime is creation time. Mtime is last modified.
0
 

Author Comment

by:Nitsan Reznik
ID: 41858493
I used 720.

Basically, I need to find all the files older than 30 days, and it looks like -mtime +30 does that. It's odd that -ctime doesn't work for me.
0
 
LVL 76

Expert Comment

by:arnold
ID: 41858611
720 means 720 days since the file was created; almost two years.

The numeric argument to find's -ctime, -mtime, etc. works as follows:
+n  older than n days ago
n   exactly n days ago; -ctime 720 will only display files that were created exactly 720 days before the query/search was run...
-n  since n days ago

Look at the find manual pages (man find).
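
The three forms can be checked against files of known age. This is a small sketch assuming GNU touch, date, and find; make_aged is a hypothetical helper, and the extra hour keeps each file safely inside its whole-day bucket, since find discards the fractional part of the age.

```shell
demo=$(mktemp -d)
now=$(date +%s)

# make_aged NAME DAYS: create a file last modified DAYS days (plus one hour) ago.
make_aged() {
  touch -d "@$(( now - $2 * 86400 - 3600 ))" "$demo/$1"
}
make_aged d10 10
make_aged d30 30
make_aged d45 45

older=$(cd "$demo" && find . -type f -mtime +30 | sort)   # more than 30 whole days old
exact=$(cd "$demo" && find . -type f -mtime 30 | sort)    # exactly 30 whole days old
newer=$(cd "$demo" && find . -type f -mtime -30 | sort)   # fewer than 30 whole days old
printf 'older: %s\nexact: %s\nnewer: %s\n' "$older" "$exact" "$newer"
```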
0
 

Author Comment

by:Nitsan Reznik
ID: 41858618
Hi,

So, in my case, I need to type 30 right?
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41858886
arnold is correct - that should be 30, not 720.  I calculated hours for some reason..  That'll teach me to answer questions before I'm fully awake.

Also, do try to use -ctime, not -mtime.  The -ctime test examines when the file's status was changed, while -mtime examines when the file's data was changed.  If you do not change a file's status (e.g., chmod, chown, etc), -ctime should match the original creation date.  The -mtime date can be affected by a wide variety of actions, some of which are not immediately intuitive.
0
 
LVL 76

Expert Comment

by:arnold
ID: 41858927
If you want to locate files that are exactly n days old, use n; if you want files older than n days, use +n; if you need files n days old or newer, use -n.

If I'm not mistaken, ctime deals with when the file was originally created. Mtime deals with when the file was last modified, and atime is when the file was last accessed, which is as soon as you list it ...... unless the mount options include noatime.
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41859603
>>> ctime deals when the file was originally created

Not quite..  most *nix systems don't actually record that information.  Instead, they record the timestamp of the last change in file status, such as permissions and ownership.  If the file's status has not been changed since its creation, though, it coincides with the creation date.
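
The difference is easy to observe on a scratch file. The sketch below assumes GNU touch, stat, and find, and uses a made-up file name; it back-dates the modification time, then makes a status change with chmod and compares what -ctime and -mtime report.

```shell
demo=$(mktemp -d)
f="$demo/sample.txt"
echo data > "$f"
touch -d '40 days ago' "$f"      # back-date the modification time only

mtime_before=$(stat -c %Y "$f")
chmod 600 "$f"                   # a status change; the data is untouched
mtime_after=$(stat -c %Y "$f")

# touch and chmod both reset ctime to "now", so -ctime +30 finds nothing,
# while -mtime +30 still matches the back-dated file.
by_ctime=$(find "$demo" -type f -ctime +30)
by_mtime=$(find "$demo" -type f -mtime +30)
echo "ctime match: [$by_ctime]  mtime match: [$by_mtime]"
```

One practical consequence: files whose permissions or ownership were adjusted recently will look "new" to -ctime even if their contents are old.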

http://www.linux-faqs.info/general/difference-between-mtime-ctime-and-atime
0
 
LVL 76

Expert Comment

by:arnold
ID: 41859697
Steve, after so many years I have found another belief of mine that had crept in from an incorrect reference; I always use mtime for such searches.
But ctime is, as you noted, the file's status change time, so a chmod 600 file will change the ctime record to now. A chown other_user file will update the ctime record as well.
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41859736
I think it's just down to preference and perception.  Neither data point is what's needed here, and either could meet the standard given equally probable circumstances.  I lean towards data changes being more common, but we're both right.  :)
0
 

Author Comment

by:Nitsan Reznik
ID: 41859934
Thank you guys. Using -ctime +30 does make more sense in my case.
0
 

Author Comment

by:Nitsan Reznik
ID: 41865787
Hey guys,

I was trying to make it work; however, it gave me an error when running the command below:

find /home -ctime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -I{} /bin/rm {}

As a reminder, I would like to remove the files older than 30 days inside directories and sub-directories.
We have /home, and under /home all the client names; inside each of them is a folder called SFTPWRITE, which is the only area where upload and download are allowed and hence where all the data is written.


[root@sftpuk /]# find /home -ctime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -I{} /bin/rm {}
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/FIC/FileZilla_3.14.1_win64/FileZilla-3.14.1/locales’: Is a directory
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/NielsenUS’: Is a directory
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/OzTam’: Is a directory
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/TiVo’: Is a directory
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/TVBeat’: Is a directory
/bin/rm: cannot remove ‘/home/inder/SFTPWRITE/VL’: Is a directory
/bin/rm: cannot remove ‘/home/internal/SFTPWRITE/HotFIX’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/124’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/398222 test’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/CR 383913 test’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/Gabor’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/Jtrachman/Company’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/Lior’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/R100’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/SRamaiah’: Is a directory
/bin/rm: cannot remove ‘/home/alldevelopers/SFTPWRITE/vjoshi’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/IBMS/doc/Feature Summaries’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/IBMS/doc/IBMS_V6_Changes_Summaries/Archive’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/IBMS DOTNET’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/IBMS DOTNET/ODAC’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/Oracle/Win32’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/Oracle/Win64’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/Proposer Data’: Is a directory
/bin/rm: cannot remove ‘/home/allclients/SFTPWRITE/Proposer Data/MACSC’: Is a directory
/bin/rm: cannot remove ‘/home/advantech/SFTPWRITE/exports’: Is a directory
/bin/rm: cannot remove ‘/home/bbcglobalnews/SFTPWRITE/6.03.16.07.01’: Is a directory
/bin/rm: cannot remove ‘/home/bbcglobalnews/SFTPWRITE/6.03.16.16’: Is a directory
/bin/rm: cannot remove ‘/home/bbcglobalnews/SFTPWRITE/6.03.16.17’: Is a directory
/bin/rm: cannot remove ‘/home/bbcglobalnews/SFTPWRITE/BBCWW_export_090309’: Is a directory
/bin/rm: cannot remove ‘/home/bbcglobalnews/SFTPWRITE/BBCWW_export_120108’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/6.03.12.74’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/6.03.12.75’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/6.03.16.16’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/6.03.16.16/380160002’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/interfaces’: Is a directory
/bin/rm: cannot remove ‘/home/bbcwservice/SFTPWRITE/newmedia’: Is a directory
/bin/rm: cannot remove ‘/home/bbtv/SFTPWRITE/6.04.02.11.05’: Is a directory
/bin/rm: cannot remove ‘/home/bbtv/SFTPWRITE/6.04.02.11.10’: Is a directory
/bin/rm: cannot remove ‘/home/bbtv/SFTPWRITE/6.04.02.11.13’: Is a directory
/bin/rm: cannot remove ‘/home/bbtv/SFTPWRITE/6.04.02.11.12’: Is a directory
/bin/rm: cannot remove ‘/home/cbs/SFTPWRITE/6.04.08.16.11’: Is a directory
/bin/rm: cannot remove ‘/home/cbs/SFTPWRITE/6.04.08.16.12’: Is a directory
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
/bin/rm: cannot remove ‘/home/cbs/SFTPWRITE/BRE_meeting_recordings’: Is a directory
0
 
LVL 76

Expert Comment

by:arnold
ID: 41865806
To remove directories, you have to add the -r argument to the rm command, but be cautious with its use.
0
 

Author Comment

by:Nitsan Reznik
ID: 41865809
Hey,

Just a clarification here: assuming I don't want to remove the directories, only the files inside them, does the command I entered above fulfill this goal?
0
 
LVL 76

Expert Comment

by:arnold
ID: 41865824
Yes. The error is more informational in this case: one of the items being passed to rm is a directory, which cannot be removed with the arguments rm is being invoked with.

The xargs error seems to point to a file or directory whose name includes a quote, and it suggests using -0 so that xargs can handle such names.
The quotes are either part of a filename that was erroneously saved with them .....
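
The quote behaviour can be reproduced without touching any files. This is a small sketch with a made-up name, assuming an xargs that supports -0 (GNU xargs does): by default xargs treats quotes and blanks as syntax, while -0 reads NUL-terminated items and takes every other byte literally.

```shell
# A name containing an apostrophe, as an SFTP client might upload it (hypothetical).
name="it's here.txt"

# Default parsing: the lone apostrophe is an unmatched quote, so xargs exits with an error.
if printf '%s\n' "$name" | xargs printf '[%s]\n' 2>/dev/null; then
  default_ok=yes
else
  default_ok=no
fi

# NUL-terminated input plus -0: the whole name arrives as a single argument.
with_null=$(printf '%s\0' "$name" | xargs -0 printf '[%s]\n')
echo "default parsing ok: $default_ok; with -0: $with_null"
```

Note that -0 only helps when the input really is NUL-terminated, which means the producer (find, and any filter in between) has to emit NULs as well.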
1
 

Author Comment

by:Nitsan Reznik
ID: 41865825
Excellent stuff!
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41865985
Also, you can add the `-type f` selector to the find command:
find /home -type f -ctime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -I{} /bin/rm {}

That will make sure find is only returning listings for actual files.
0
 

Author Comment

by:Nitsan Reznik
ID: 41866963
This is what I get at the end:

xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option

Shall I use -O?
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41867033
Yes, and make sure you're using "-<zero>", not "-<capital-oh>".
0
 

Author Comment

by:Nitsan Reznik
ID: 41867047
This is what I typed, but it's giving me an invalid option error:

find /home -mtime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -I{} /bin/rm {} -0
0
 
LVL 50

Expert Comment

by:Steve Bink
ID: 41867067
Try:
find /home -mtime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -0 -I{} /bin/rm {}

0
 

Author Comment

by:Nitsan Reznik
ID: 41867091
this is what I get:

find /home -mtime +30 | grep '/SFTPWRITE/' | grep -vE '/(imibms|ctv/6.03.22.62.18|ctv/6.03.22.62.19)/' | xargs -0 -I{} /bin/rm {}


xargs: argument line too long
0
 
LVL 76

Expert Comment

by:arnold
ID: 41867141
You could check whether you have files that include quotes in their names, and how that might have happened or be continuing to happen.....


Skip the -0 option to xargs: the quote problem seems to involve a single file, otherwise that informational error would have appeared as many times as there are such files.
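
The "argument line too long" message is consistent with xargs -0 receiving newline-separated names (find was not given -print0) and treating the whole list as one oversized item. One possible way around both errors, assuming GNU xargs, is -d '\n', which splits on newlines only and takes quotes literally; it still cannot cope with names that contain a newline. The sketch below wraps the pipeline in a hypothetical purge_old function and exercises it on a scratch directory; the exclusion names, script path, and cron schedule are placeholders.

```shell
# purge_old BASE DAYS: delete regular files under any SFTPWRITE directory below BASE
# that were modified more than DAYS days ago, skipping the excluded sub-directories.
purge_old() {
  find "$1" -type f -mtime "+$2" \
    | grep '/SFTPWRITE/' \
    | grep -vE '/(exc01|exc02|exc03)/' \
    | xargs -d '\n' -r rm -f --
}

# Try it on a scratch copy of the layout first (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/a01/SFTPWRITE/exc01"
touch -d '40 days ago' "$demo/a01/SFTPWRITE/it's old.txt" "$demo/a01/SFTPWRITE/exc01/keep.txt"
touch "$demo/a01/SFTPWRITE/new.txt"
purge_old "$demo" 30
survivors=$(cd "$demo" && find . -type f | sort)
echo "$survivors"

# Saved as a script that ends with `purge_old /home 30`, it could be scheduled
# with a crontab entry such as (assumed path, daily at 02:30):
# 30 2 * * * /usr/local/sbin/purge_sftpwrite.sh
```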
0
