Solved

SCRIPT: sh cron script to selectivly delete files based on a timestamp in the file name

Posted on 2003-03-14
Medium Priority
1,115 Views
Last Modified: 2013-12-27
Basically, I want a cron job to call a script that will delete or NOT delete each file, from a list of files in a single directory, based on the timestamp in the file name. I will post what I have so far below; first, an explanation.

1. output ls to a file
2. split that file into two, using find -mtime +30, to divide files older than 30 days from files newer than 30 days
3. a bunch of 'sed'ing to extract just the date part of the file name, leaving something like 20030316 (YYYYMMDD)

And now I am stuck (other than my idea of a LOT of awk/sed-ing and regular expressions). I want to:

on files older than 30 days, keep ONLY one: either the last Friday of the month, or the youngest for that month. For example, if there was 20030301, 20030320 and 20030330, I would want to keep 20030330 and delete the others.

on files younger than 30 days, keep one file per week, unless a file is less than one week old, in which case keep everything.
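The older-than-30-days rule can be sketched without heavy regular expressions: once each name has been reduced to a bare YYYYMMDD date, sort the dates in descending order and keep only the first date seen per YYYYMM prefix. A minimal sketch on made-up dates (the real input would be the cleaned-up file list):

```shell
#!/bin/sh
# Sketch of the "keep only the latest per month" rule on made-up
# YYYYMMDD dates: sort descending, then keep the first date seen for
# each YYYYMM prefix.
printf '%s\n' 20030301 20030320 20030330 20030215 |
sort -rn |
awk '
  { month = substr($0, 1, 6) }   # YYYYMM prefix
  !seen[month]++ { print }       # first hit per month = latest date
' | sort -n
```

This prints 20030215 and 20030330: one survivor per month, everything else is a deletion candidate.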

here is what I have:

#!/bin/sh
#
# variables

BACKUPDIR=/blah/blah/
SYSTEMDATE=`date '+%m-%d-%y'`

# start with an inventory of backup files

ls -l $BACKUPDIR > /tmp/fileoutput

# splits files between older and younger than 30 days old

find $BACKUPDIR -mtime +30 > /tmp/morethan30
find $BACKUPDIR -mtime -30 > /tmp/lessthan30

# split full and incremental files less than 30 days old

grep full /tmp/lessthan30 > /tmp/lessthan30full
grep incre /tmp/lessthan30 > /tmp/lessthan30incre


sed -e 's/\/blah\/blah\///g' \
    -e 's/-blahblah\.//g' \
    -e 's/-1\.tgz//g' \
    -e 's/-full//g' /tmp/morethan30 |
sort -n > /tmp/morethan30all

sed -e 's/\/blah\/blah\///g' \
    -e 's/-blahblah\.//g' \
    -e 's/-[1-5]\.tgz//g' \
    -e 's/-full//g' /tmp/lessthan30full |
sort -n > /tmp/lessthan30full.dates
# NB: redirect to a NEW file -- writing back onto /tmp/lessthan30full
# would truncate it before sed reads it


sed -e 's/\/blah\/blah\///g' \
    -e 's/-blahblah\.//g' \
    -e 's/-[1-5]\.tgz//g' \
    -e 's/-incremental//g' /tmp/lessthan30incre |
sort -n > /tmp/lessthan30incre.dates
# same caveat: write to a new file, not the input file

sed -n '/200[2-9][ ][1-21]/p' /tmp/morethan30all   # stuck here: [1-21] is a character class, not the range 1-21
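For the younger-than-30-days rule (one file per week), a possible next step, assuming GNU date is available and that the cleaned-up list holds one YYYYMMDD per line (the dates below are made-up samples):

```shell
#!/bin/sh
# Sketch (GNU date assumed): tag each YYYYMMDD date with its ISO week,
# then keep only the latest date in each week. The sample dates stand
# in for the cleaned-up /tmp/lessthan30full.dates list.
for d in 20030303 20030305 20030310 20030312; do
  echo "$(date -d "$d" +%G-%V) $d"      # prefix each date with ISO year-week
done |
sort -r |
awk '!seen[$1]++ { print $2 }' |        # first hit per week = latest date
sort
```

This keeps 20030305 and 20030312 (one per ISO week); the dates that drop out are the deletion candidates.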
Question by:alexpa
8 Comments
 
LVL 8

Expert Comment

by:heskyttberg
ID: 8138341
Hi!

I don't really understand what you are trying to do here, but perhaps it is something like this.

Let's say you back up Mon-Fri.

In your backup setup, either make two backup sets or two scripts.

Have the Mon-Thu backups use one file extension and the Friday backup another.

Then do something like this:
find $BACKUPDIR -mtime +7 -name '*.mon-thu-ext' -exec rm -f {} \;
find $BACKUPDIR -mtime +35 -name '*.fri-ext' -exec rm -f {} \;

I'm not 100% sure about the find syntax, but anyway, couldn't you do something like the above to save yourself some pain?
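A quick way to verify the find syntax on a throwaway directory (the extensions here are the placeholder names from the comment above): the -name pattern must be quoted so the shell does not expand it, and -exec needs a terminating escaped semicolon.

```shell
#!/bin/sh
# Sketch: exercise the find/-exec form against scratch files.
dir=$(mktemp -d)
touch "$dir/mon.mon-thu-ext" "$dir/fri.fri-ext"

# Quoted pattern, escaped semicolon terminating -exec.
find "$dir" -type f -name '*.mon-thu-ext' -exec rm -f {} \;

ls "$dir"        # only the .fri-ext file remains
rm -rf "$dir"
```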

If you really want to do the above, I think you should write a Perl script instead, since there is so much string parsing going on.

Regards
/Hans - Erik Skyttberg
 

Author Comment

by:alexpa
ID: 8147093
Sorry if that was not clear. Basically, we have an existing job that makes backups every day. We want a script that will go through and delete all of the old ones if they match a criterion: any backup older than 30 days gets deleted, UNLESS it is the one from the last Friday of the month. So 200[1-3][1-2][1-24] would be deleted, but 200[1-3][1-2][25-31] would not. I tried using regular expressions, but I could not get a proper match; I was wondering if there is a better way to do it.
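The reason the range match fails: a bracket expression such as [25-31] matches a single character from the set, never the two-digit numbers 25 through 31. Matching a day range takes an alternation, for example with grep -E (a sketch on made-up dates):

```shell
#!/bin/sh
# Sketch: [25-31] is a character class, not a number range. To match
# days 25-31 at the end of a YYYYMMDD date, use an alternation:
# 2[5-9] covers 25-29, 3[01] covers 30-31.
printf '%s\n' 20030324 20030325 20030331 |
  grep -E '^[0-9]{6}(2[5-9]|3[01])$'
```

Only 20030325 and 20030331 pass the filter; 20030324 does not.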

thanks!
 
LVL 8

Expert Comment

by:heskyttberg
ID: 8147821
Hi!

I don't think there is any easy way of doing this.
I also hope this backup is going onto a RAID array with a tape backup, or at least a mirrored set, and not onto the same disk you are backing up.

I don't see why you need to keep the last Friday of each month.

Figuring out which date the last Friday of every month falls on is your real problem, and it makes this all but impossible to do with sed. You really need a Perl script to do this.

I still think you have a strange backup strategy. Is the Friday backup a FULL backup, and is that why you want to keep it?

Regards
/Hans - Erik Skyttberg
 

Author Comment

by:alexpa
ID: 8165116
Yeah, we have a backup of everything on tape, but to conserve hard disk space we want to keep only a single full backup for anything older than a month. Why the last Friday? Just to have an 'end of the week' copy, but that doesn't matter so much; the last day of the month works as well.

We have daily FULL backups, every day, and we want to keep them all (on disk) unless they are older than 30 days. So the script should find the newest backup of each month and then nuke the rest.
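If the last-Friday variant is ever wanted after all, it can be computed without parsing filenames, assuming GNU date is available (a sketch; March 2003 is just an example month):

```shell
#!/bin/sh
# Sketch (GNU date assumed): compute the last Friday of a month.
month=2003-03
last=$(date -d "$month-01 +1 month -1 day" +%Y-%m-%d)   # last day of month
while [ "$(date -d "$last" +%u)" -ne 5 ]; do            # %u: 1=Mon .. 5=Fri
  last=$(date -d "$last -1 day" +%Y-%m-%d)              # walk back one day
done
echo "$last"
```

For March 2003 this walks back from Monday the 31st and prints 2003-03-28.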
 
LVL 8

Expert Comment

by:heskyttberg
ID: 8177919
Hi!

Why not just do something like this:
find /my_back_dir -mtime +32 -exec rm -f {} \;

This will delete all files older than 32 days. If they are on tape anyway, why not go even further:
find /my_back_dir -mtime +7 -exec rm -f {} \;

You will have 7 days of backups on disk, and the rest will be on tape. This means if you ever need to restore anything older than 7 days you need to get the tape, but how often does that happen?

I have worked for 5 years with both Windows and Unix and never had a need to restore anything older than 7 days. The oldest restore we ever needed to do was 5 days.

I try to keep things as simple and easy as possible when doing administration.

Your situation might be different, but still try to keep it simple. Another solution, as described above, would be to give the Friday backups one type of TAG in the filename and all the others another, and do two find/rm passes. Or even put them in one directory each.

Regards
/Hans - Erik Skyttberg
 

Author Comment

by:alexpa
ID: 8181177
Well, true that we usually would only want backups from the last 7 days, but management says, based on insurance, that we must have 1 full backup, on DISK, for every month.

So the question is the same. I thank you for suggesting that I may want to do something different, but what I want to do is still the same: delete all old backups EXCEPT the one from the last Friday of each month.
 
LVL 8

Accepted Solution

by:
heskyttberg earned 225 total points
ID: 8185760
Hi!

Ok, do something like this then:

mkdir -p /backup/friday
mkdir -p /backup/others

Make a cron script that runs after the backup every Friday and moves those backups to /backup/friday.
This could be something like:
find /backup/others -mmin -240 -exec mv {} /backup/friday \;

Now just do this:
find /backup/others -mtime +7 -exec rm -f {} \;
find /backup/friday -mtime +38 -exec rm -f {} \;

This would leave you with Friday backups for the last 38 days, and all other backups for the last 7 days, all handled in the delete script.

That would be easier and less painful than trying to keep all the files in the same directory and parsing them with sed, awk and so on.
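A runnable sketch of this rotation on a throwaway tree, assuming GNU touch for the faked timestamps (the paths are placeholders; note that -mtime +N selects files OLDER than N days):

```shell
#!/bin/sh
# Sketch of the two-directory rotation against scratch files with
# faked modification times (GNU touch -d assumed).
root=$(mktemp -d)
mkdir -p "$root/friday" "$root/others"

touch "$root/others/fresh.tgz"
touch -d '10 days ago' "$root/others/stale.tgz"
touch -d '40 days ago' "$root/friday/ancient.tgz"
touch "$root/friday/recent.tgz"

# -mtime +N matches files whose age exceeds N whole days
find "$root/others" -type f -mtime +7  -exec rm -f {} \;
find "$root/friday" -type f -mtime +38 -exec rm -f {} \;

ls "$root/others"     # fresh.tgz survives
ls "$root/friday"     # recent.tgz survives
rm -rf "$root"
```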

I think your only other option would be to write a Perl script instead. I'm not great with Perl, but in Perl you could do the kind of string handling you want.

Regards
/Hans - Erik Skyttberg
 

Author Comment

by:alexpa
ID: 8284974
Well, it looks like I either need to learn Perl or just make a separate cron job to copy the Friday backups to a different directory. Thanks,

Alex
