Asked by S-a-t

create tar file backup in gz and keep for 180 days (6 months) before removing

Hi Experts!

I want to create a shell script to take a compressed backup of the ABCD.XYZ.*.tar and XYZ.ABCD.*.tar files and keep them for 180 days (6 months).
After 6 months they can be removed: the latest 6 months of files should stay on the system, and anything older should be removed.
The files are already .tar and are generated daily; each file is approximately 50 MB.

How do I create shell script for this?
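The accepted answer isn't shown in this thread, but a minimal sketch of the kind of script being asked for might look like the following. The `rotate_backups` name and the use of each file's modification time as its age are my assumptions, and the path passed in would be the real backup directory:

```shell
#!/bin/sh
# Sketch only: compress the daily tar files, then delete compressed
# backups older than 180 days (by modification time).
rotate_backups() {
    dir="$1"
    # Compress any uncompressed daily tar files (skip patterns that
    # match nothing -- the glob stays literal in that case).
    for f in "$dir"/ABCD.XYZ.*.tar "$dir"/XYZ.ABCD.*.tar; do
        [ -f "$f" ] && gzip "$f"
    done
    # Remove compressed backups whose mtime is more than 180 days old.
    find "$dir" -name 'ABCD.XYZ.*.tar.gz' -mtime +180 -exec rm '{}' \;
    find "$dir" -name 'XYZ.ABCD.*.tar.gz' -mtime +180 -exec rm '{}' \;
}

# Example: rotate_backups /path/to/backups   (run daily from cron)
```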

Thanks in Advance!
Shell Scripting, Linux, Scripting Languages, Unix OS

Last comment: 8/22/2022 (Mon)


What creates these files? Adding a -z option to that tar command would compress them at creation time.
Then all that remains is to remove the old ones, as Simon's script demonstrates.

You would stagger the files, i.e. move them in stages:
30 days old
60 days old
90 days old
120 days old
180 days old

The reason: if there is a large number of files, the "rm '{}' \;" could run into issues, as there is a limit on how many names / how long the argument list passed to a command can be.
The above breakdown would limit the amount of data transitioning through each stage.
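A rough sketch of that staging idea (the stage directory names are made up, and the -mtime ranges assume it is run daily so files walk through the stages one at a time):

```shell
# Move backups through age-based stage directories before final removal.
stage_backups() {
    dir="$1"
    mkdir -p "$dir/30d" "$dir/60d"
    # Older than 30 days but not yet 60: move into the 30d stage.
    find "$dir" -maxdepth 1 -name '*.tar.gz' -mtime +30 -mtime -60 \
        -exec mv '{}' "$dir/30d/" \;
    # Older than 60 days: advance to the 60d stage, and so on.
    find "$dir/30d" -name '*.tar.gz' -mtime +60 -exec mv '{}' "$dir/60d/" \;
    # Finally, anything older than 180 days is removed.
    find "$dir/60d" -name '*.tar.gz' -mtime +180 -exec rm '{}' \;
}
```

(As discussed further down, "-exec rm '{}' \;" runs rm once per file, so this staging isn't actually needed to stay under the argument-list limit.)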

@Arnold number of files shouldn't be a problem.  With the '\;' argument at the end, the exec calls "rm" once for each file it finds. If I'd used "+" instead, that would have built up a long command line with multiple file names.

I would normally use something like
find  . -name "*.tar.gz" | xargs rm


to avoid calling rm many times and to avoid over-long command lines, but if this command is run every day there would usually only be one file to delete so the -exec will only call rm once anyway.

Simon, yes, I realize that using xargs would deal with that. But I try to cover the possibilities, as it is not clear on what schedule these files are being created.
From the question there seem to be multiple processes involved which at some point culminate in a set of files ABCD.XYZ.*.tar,
which then need to be compressed. gzip will retain the original file's timestamp.

Sorry, I didn't explain that properly.  I used "find ... -exec" rather than "xargs" because there would probably only be a few files, and the overhead of repeated "rm" calls is small (my "find" command calls "rm" once for each file to be deleted, while the "xargs" version would call "rm" as few times as possible, getting as many file names on each "rm" call as the OS allows).

Even if there are thousands of files to be deleted, the "-exec rm '{}' \;" will be called once for each file, so there is no danger of a command line that is too long.  The overhead of calling "rm" thousands of times is unfortunate, but it would still be fairly small ("rm" is quite quick to start).

The "find ... -exec" version also has the benefit that filenames with odd characters in them are handled properly.  The "xargs" version will fail if filenames contain spaces, newlines, pipe characters and so on (yes, all of those are allowed!), though there is a fix for that: give "-print0" as a "find" option, and "-0" as an "xargs" one.
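The "-print0"/"-0" fix mentioned above can be sketched as follows (GNU find and xargs assumed; the 180-day cutoff is the one from the question, and `purge_old` is just an illustrative name):

```shell
# Delete *.tar.gz files under "$1" older than 180 days.
# NUL-delimited names survive spaces and newlines in filenames;
# -r (GNU xargs) skips running rm entirely when nothing matches.
purge_old() {
    find "$1" -name '*.tar.gz' -mtime +180 -print0 | xargs -0 -r rm
}
```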

I believe the opposite of that is what happens: I think "-exec rm {} \;" passes the results of the find into the set {} and then runs rm with the list of files, as though you had called
rm file1 file2 file3
Find may have evolved since then to handle the OS restriction on the number of files that can be passed on the command line, to avoid the dreaded memory-overrun error.
I think since then I pass the results to a while read loop instead.

I wasn't sure, so looked it up.

  -exec rm '{}' \;

will call "rm" once for each file

  -exec rm '{}' +

will build up a list of files for the "rm". I don't know whether it checks for maximum command line length.

Edit: just looked again. It *does* limit the command length, like xargs, so can be safely called with many filenames.
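That difference can be demonstrated without deleting anything by substituting echo for rm; the helper names here are just for illustration:

```shell
# With \; the command runs once per file, so echo prints one line
# per match; with + find batches as many names as fit on one
# command line, so a handful of files come out on a single line.
per_file() { find "$1" -name '*.tar.gz' -exec echo rm '{}' \; ; }
batched()  { find "$1" -name '*.tar.gz' -exec echo rm '{}' + ; }
```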



That was a good idea; I didn't think of it that way.
I see only one file getting created daily, which is about 50 MB.

Thank you simon3270.

I am using this script in my environment.