shragi asked:
Perl script to find new files and compress those new files
Hi - My other software dumps files into one folder (for example, FolderA).
Now I want a script that regularly watches this folder (maybe once every 15 minutes) to find new files, compress those new files, and copy them to a different folder with a timestamp.
Example: FolderA contains the files below:
1) testfile1.txt
2) testfile2.txt
FolderB already contains these old files:
1) testfile1_09152016_121066.txt
2) testfile2_09152016_121066.txt
3) testfile1_09142016_121066.txt
4) testfile2_09142016_121066.txt
Now I want to check whether testfile1.txt is a new file or not by comparing it with the latest matching file in FolderB,
i.e. comparing testfile1.txt with testfile1_09152016_121066.txt, and if it is different, then rename testfile1.txt with a timestamp and copy it to FolderB after compressing it.
The files are very big - 1 GB minimum and 4 GB maximum -
so I can't compare the actual contents of the files.
Can someone help me with how to identify the new files and compress them?
Thanks,
And then
tar --newer `date -d'7 days ago' +"%d-%b"` -zcf thisweek.tgz
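If you would rather compress each file individually with a timestamp in its name (as in your FolderB examples) instead of one weekly tarball, something along these lines might work in Perl. The folder paths, the gzip compression, and the MMDDYYYY_HHMMSS stamp are assumptions based on your sample names, so adjust as needed:

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);
use IO::Compress::Gzip qw(gzip $GzipError);

# Assumed locations - substitute your real paths.
my $folder_a = '/path/to/FolderA';
my $folder_b = '/path/to/FolderB';

# Timestamp format guessed from the example names (MMDDYYYY_HHMMSS).
my $stamp = strftime('%m%d%Y_%H%M%S', localtime);

for my $src (glob "$folder_a/*.txt") {
    my ($base) = $src =~ m{([^/]+)\.txt$};      # e.g. "testfile1"
    my $dst    = "$folder_b/${base}_$stamp.txt.gz";

    # IO::Compress::Gzip streams the data, so 1-4 GB files are fine.
    gzip $src => $dst
        or die "gzip failed for $src: $GzipError";

    print "Compressed $src -> $dst\n";
}

Note that this still compresses everything it finds in FolderA; deciding which files are actually new is the piece that still needs to be worked out.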
Based on your description, the files in folderB are compressed, but the ones in folderA aren't and won't be until after the comparison, so how do you want to handle that difference when comparing?
With such large files, the best way to make a comparison would be to generate an MD5 checksum of each and compare those checksums. Both files would need to be in the same state (i.e., either both compressed or both uncompressed) when generating the checksums.
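For example, a rough sketch of that comparison in Perl, assuming both copies are uncompressed and using placeholder paths:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;

# Compute the MD5 hex digest of a file without slurping it into memory,
# which matters for 1-4 GB files.
sub md5_of {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    binmode $fh;
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# Placeholder paths - substitute the real FolderA/FolderB files.
my $new_file = '/path/to/FolderA/testfile1.txt';
my $old_file = '/path/to/FolderB/testfile1_09152016_121066.txt';

if (md5_of($new_file) ne md5_of($old_file)) {
    print "$new_file differs from $old_file - treat it as new\n";
}
else {
    print "$new_file matches $old_file - nothing to do\n";
}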
ASKER
Hi FishMonger - if both files need to be in the same state to generate the checksums, then we can add a third folder for the compressed files.
FolderA - Drop zone where you can find new files
FolderB - Contains old files with timestamp
FolderC - Contains compressed files of FolderB.
So how do we do the checksum? I mean, how do we write the script for that?
ASKER CERTIFIED SOLUTION
#!/bin/bash
PATH_SRC="/home/celvas/Doc"
PATH_DST="/home/celvas/Dow"

cd "$PATH_SRC" || exit 1

# Seconds since the epoch for today's midnight and for the current time.
TODAY=$(date -d "$(date +%F)" +%s)
TODAY_TIME=$(date -d "$(date +%T)" +%s)

for f in *
do
    # echo "File -> $f"
    # Modification time of the file, with the trailing timezone stripped
    # so that date can parse it.
    MOD_DATE=$(stat -c %y "$f")
    MOD_DATE=${MOD_DATE% *}
    # echo MOD_DATE: $MOD_DATE
    MOD_DATE1=$(date -d "$MOD_DATE" +%s)

    # Difference between the file's mtime and midnight / the current time.
    DIFF_IN_DATE=$(( MOD_DATE1 - TODAY ))
    DIFF_IN_DATE1=$(( MOD_DATE1 - TODAY_TIME ))
    # echo DIFF: $DIFF_IN_DATE
    # echo DIFF1: $DIFF_IN_DATE1

    # Pick up files whose modification time falls within two minutes of this run.
    if [[ ($DIFF_IN_DATE -ge -120) && ($DIFF_IN_DATE1 -le 120) && ($DIFF_IN_DATE1 -ge -120) ]]
    then
        echo "File lies in Next Hour = $f"
        echo "MOD_DATE: $MOD_DATE"
        #mv "$PATH_SRC/$f" "$PATH_DST/$f"
    fi
done
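Since the question originally asked for Perl, here is a rough sketch of the same overall idea using the FolderA/FolderB/FolderC layout and the MD5 comparison suggested above. The paths, the gzip compression, and the MMDDYYYY_HHMMSS timestamp format are assumptions, so treat it as a starting point rather than a finished solution:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Copy qw(copy);
use IO::Compress::Gzip qw(gzip $GzipError);
use POSIX qw(strftime);

my $folder_a = '/path/to/FolderA';   # drop zone with the incoming files
my $folder_b = '/path/to/FolderB';   # timestamped copies
my $folder_c = '/path/to/FolderC';   # compressed copies of FolderB

# MD5 hex digest of a file, read through a filehandle so large files
# are not slurped into memory.
sub md5_of {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    binmode $fh;
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# Newest FolderB file whose name starts with the given base name.
sub latest_old_copy {
    my ($base) = @_;
    my @candidates = sort { (stat $b)[9] <=> (stat $a)[9] }
                     glob "$folder_b/${base}_*.txt";
    return $candidates[0];    # undef when there is no old copy yet
}

my $stamp = strftime('%m%d%Y_%H%M%S', localtime);

for my $src (glob "$folder_a/*.txt") {
    my ($base) = $src =~ m{([^/]+)\.txt$};
    my $old    = latest_old_copy($base);

    # Treat the file as new when there is no old copy or the checksums differ.
    next if defined $old && md5_of($src) eq md5_of($old);

    my $copy_b = "$folder_b/${base}_$stamp.txt";
    my $gz_c   = "$folder_c/${base}_$stamp.txt.gz";

    copy($src, $copy_b) or die "copy $src -> $copy_b failed: $!";
    gzip $src => $gz_c  or die "gzip $src -> $gz_c failed: $GzipError";

    print "New file $src stored as $copy_b and $gz_c\n";
}

Run from cron (or any other scheduler) every 15 minutes, this gives roughly the polling behaviour described in the question.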