Avatar of alexpa
alexpa

asked on

SCRIPT: sh cron script to selectively delete files based on a timestamp in the file name

Basically, I want a cron job to call a script that will delete or NOT delete a file, from a list of files in a single directory, based on the timestamp in the file name. I will post what I have so far below; first, an explanation.

1. output ls to a file
2. split that file into two, using find -mtime +30, to divide files older than 30 days from files newer than 30 days
3. a bunch of 'sed'ing to extract just the date part of the file name, leaving something like 20030316 (YYYYMMDD)

and now I am stuck (other than my idea of a LOT of awk/sedding and regular expressions). I want to:

on files older than 30 days, keep ONLY one: either the last Friday of the month, or the youngest for that month. For example, if there were 20030301, 20030320 and 20030330, I would want to keep 20030330 and delete the others.

on files younger than 30 days, I want to keep one file per week, unless a file is less than one week old, in which case I want to keep everything.
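For the per-week rule, one possible shortcut (a sketch, assuming GNU date, whose -d option accepts a bare YYYYMMDD) is to map each stamp to a year-week bucket and keep one file per bucket:

```shell
#!/bin/sh
# map YYYYMMDD stamps to year-week buckets; keeping only the
# last file seen per bucket gives "one file per week"
# (assumes GNU date: -d accepts a bare YYYYMMDD)
for stamp in 20030310 20030312 20030317; do
    week=`date -d "$stamp" '+%Y-%W'`
    echo "$stamp $week"
done
```

Stamps from the same week print the same bucket key, so a later pass can drop all but one per key.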

here is what I have:

#!/bin/sh
#
# variables

BACKUPDIR=/blah/blah/
SYSTEMDATE=`date '+%m-%d-%y'`

# start with an inventory of backup files

ls -l $BACKUPDIR > /tmp/fileoutput

# split files between older and younger than 30 days old

find $BACKUPDIR -mtime +30 > /tmp/morethan30
find $BACKUPDIR -mtime -30 > /tmp/lessthan30

# split full and incremental files less than 30 days old

grep full /tmp/lessthan30 > /tmp/lessthan30full
grep incre /tmp/lessthan30 > /tmp/lessthan30incre

# strip everything but the YYYYMMDD stamp; note the output must
# go to a NEW file -- "sed file | ... > file" truncates the input
# before sed gets to read it

sed -e 's/\/blah\/blah\///' \
    -e 's/-blahblah\.//' \
    -e 's/-1\.tgz//' \
    -e 's/-full//' /tmp/morethan30 |
sort -n > /tmp/morethan30all

sed -e 's/\/blah\/blah\///' \
    -e 's/-blahblah\.//' \
    -e 's/-[1-5]\.tgz//' \
    -e 's/-full//' /tmp/lessthan30full |
sort -n > /tmp/lessthan30full.dates

sed -e 's/\/blah\/blah\///' \
    -e 's/-blahblah\.//' \
    -e 's/-[1-5]\.tgz//' \
    -e 's/-incremental//' /tmp/lessthan30incre |
sort -n > /tmp/lessthan30incre.dates

# NB: [1-21] is a bracket expression matching a single character
# ("1" or "2"), not the range 1-21; match the date with explicit
# digit positions instead
sed -n '/^200[2-9][01][0-9][0-3][0-9]/p' /tmp/morethan30all
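From here, one way to finish the "youngest per month" part: the month is the first six characters of a YYYYMMDD stamp, so a single awk pass over the sorted list can record the last stamp seen per month. A sketch (awk's for-in order is unspecified, hence the trailing sort):

```shell
#!/bin/sh
# print the youngest YYYYMMDD stamp of each month from a cleaned
# stamp list on stdin; any stamp NOT printed is a delete candidate
youngest_per_month() {
    sort -n |
    awk '{ keep[substr($0, 1, 6)] = $0 }
         END { for (m in keep) print keep[m] }' |
    sort -n
}

# usage: youngest_per_month < /tmp/morethan30all
```

Inverting the result (e.g. with grep -v -F against the original list) yields the files to remove.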
Avatar of heskyttberg
heskyttberg

Hi!

I don't really understand what you are trying to do here, but perhaps you are after something like this.

Let's say you backup mon-fri.

In your backup script program, either make two sets, backup sets or scripts.

Make the script that backs up Mon-Thu use one extension and the Fri backup another.

Then do something like this:
find $BACKUPDIR -mtime +7 -name '*.mon-thu-ext' -exec rm -f {} \;
find $BACKUPDIR -mtime +35 -name '*.fri-ext' -exec rm -f {} \;

I'm not 100% sure about the find syntax, but anyway, couldn't you do something like the above to save yourself some pain?

If you really want to do the above, I think you should write a perl script instead, since there is so much string parsing going on.

Regards
/Hans - Erik Skyttberg
Avatar of alexpa

ASKER

Sorry if that was not clear. Basically, we have an existing backup job that makes a backup every day. We want a script that will go through and delete all of the old ones if they match a criterion. The criterion is: any backup older than 30 days will be deleted, UNLESS it is the one from the last Friday of the month. So 200[1-3][1-2][1-24] would be deleted, but 200[1-3][1-2][25-31] would not. I tried using regular expressions, but I could not get a proper match; I was wondering if there were a better way to do it.

thanks!
Hi!

I don't think there is any easy way of doing this.
I also hope this backup is going onto a RAIDed and tape-backed-up disk, or at least a mirrored set, and not onto the same disk you are backing up.

I don't see why you need to keep the last Friday of each month?

Figuring out what date the last Friday of each month falls on is your problem, and it makes this just about impossible to do with sed. You really need a perl script to do this.

I still think you have a strange backup strategy. Is the Friday backup a FULL backup, and is that why you want to keep it?

Regards
/Hans - Erik Skyttberg
Avatar of alexpa

ASKER

Yeah, we have a backup of everything on tape, but to conserve hard disk space we want to keep only a single full backup for anything older than a month. Why the last Friday? Just to have an 'end of the week' copy, but that doesn't matter so much; the last day of the month works as well.

We have FULL backups every day, and we want to keep them all (on disk) unless they are older than 30 days. For those, the script should find the newest of each month and then nuke the rest.
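That policy can be put into a single pass: given the list of old backup paths, an awk filter can emit everything except the newest stamp per month, ready to feed to rm. This is only a sketch; it assumes each filename carries an embedded YYYYMMDD stamp.

```shell
#!/bin/sh
# read backup paths on stdin; print every path that is NOT the
# newest of its month -- i.e. the delete list
# (assumes a YYYYMMDD stamp embedded somewhere in each name)
prune_candidates() {
    awk '{
        if (match($0, /20[0-9][0-9][01][0-9][0-3][0-9]/)) {
            stamp = substr($0, RSTART, RLENGTH)
            month = substr(stamp, 1, 6)
            if (stamp > newest[month]) {
                newest[month] = stamp
                keep[month]   = $0
            }
            lines[++n]  = $0
            monthof[$0] = month
        }
    }
    END {
        for (i = 1; i <= n; i++)
            if (lines[i] != keep[monthof[lines[i]]])
                print lines[i]
    }'
}

# usage: find $BACKUPDIR -mtime +30 | prune_candidates | xargs rm -f
```

String comparison works here because YYYYMMDD sorts lexically the same as chronologically.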
Hi!

Why not just do something like this:
find /my_back_dir -mtime +32 -exec rm -f {} \;

This will delete all files older than 32 days. If they are on tape anyway, why not even do this:
find /my_back_dir -mtime +7 -exec rm -f {} \;

You will have 7 days of backups on disk, and the rest will be on tape. This means if you ever need to restore anything older than 7 days you need to get the tape, but how often does that happen?

I worked for 5 years with both Windows and Unix and never had a need to restore anything older than 7 days. The oldest restore we ever needed to do was 5 days.

I try to keep things as simple and easy as possible when doing administration.

Your situation might be different, but still try to keep it simple. Another solution would be, as described above, to give Friday backups one type of tag in the filename and all the others another, and do two find/rm passes. Or even put each kind in its own directory.
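The two-tag idea, written with quoted patterns and a properly terminated -exec (tag names and layout are placeholders):

```shell
#!/bin/sh
# delete daily backups after a week and Friday backups after 35
# days; '-daily'/'-friday' tags are placeholders for whatever
# naming the backup script uses
prune_two_tags() {
    find "$1" -mtime +7  -name '*-daily.tgz'  -exec rm -f {} \;
    find "$1" -mtime +35 -name '*-friday.tgz' -exec rm -f {} \;
}

# usage: prune_two_tags /my_back_dir
```

Quoting the -name patterns matters: an unquoted glob would be expanded by the shell in the current directory before find ever sees it.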

Regards
/Hans - Erik Skyttberg
Avatar of alexpa

ASKER

Well, true that we usually would only want backups from the last 7 days, but management says, based on insurance, that we must have 1 full backup, on DISK, for every month.

So the question is the same. I thank you for suggesting that I might want to do something different, but what I want to do is still the same: delete all old backups EXCEPT the one from the last Friday of each month.
ASKER CERTIFIED SOLUTION
Avatar of heskyttberg
heskyttberg

Avatar of alexpa

ASKER

Well, it looks like I either need to learn perl or just make a separate cron job to copy the Friday backups to a different dir. Thanks,
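For the copy-aside route, perl may not be needed: GNU date can classify a filename stamp by weekday, so a small cron-run script can sweep Friday-stamped files into their own directory. A sketch (paths and the stamp-extraction pattern are placeholders; assumes GNU date):

```shell
#!/bin/sh
# return success if a YYYYMMDD stamp falls on a Friday
# (assumes GNU date; %u prints 1=Monday .. 7=Sunday)
is_friday() {
    [ "`date -d "$1" '+%u'`" = "5" ]
}

# example weekly sweep (paths are placeholders):
# for f in /blah/blah/*.tgz; do
#     stamp=`expr "$f" : '.*\(20[0-9][0-9][01][0-9][0-3][0-9]\)'`
#     is_friday "$stamp" && mv "$f" /blah/friday-keep/
# done
```

With the Friday copies isolated, the main directory can be pruned with a single plain find -mtime +30.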

Alex