Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Perl script to find new files and compress those new files

Posted on 2016-09-14
5
Medium Priority
?
192 Views
Last Modified: 2016-09-26
Hi - My other software dumps files in the one folder (example: folderA)
Now I want a script that consistently watch this folder (May be once in 15min) to find new files and compress those new files and copy it to different folder with timestamp.

Example: FolderA contains the below files
1) testfile1.txt
2) testfile2.txt

FolderB already contains the old files
1) testfile1_09152016_121066.txt
2) testfile2_09152016_121066.txt
3) testfile1_09142016_121066.txt
4) testfile2_09142016_121066.txt

Now I want to check whether testfile1.txt is new file or not by comparing with the latest files from folderB
like comparing testfile1.txt with testfile1_09152016_121066.txt and if it is different then rename testfile1.txt with timestamp and copy the file to folderB after compressing it.
The size of the file is so big, its 1GB min and 4gb Max.
so can't compare the actual content in the file.

So can someone help me how to identify new files and compress them.

Thanks,
0
Comment
Question by:shragi
  • 2
  • 2
5 Comments
 
LVL 7

Expert Comment

by:D Patel
ID: 41799368
You can use like this:

PATH_SRC="/home/celvas/Documents/Imp_Task/"
PATH_DST="/home/celvas/Downloads/zeeshan/"

cd $PATH_SRC
TODAY=$(date  -d "$(date +%F)" +%s)
TODAY_TIME=$(date -d "$(date +%T)" +%s)


for f in `ls`;
do
#       echo "File -> $f"
        MOD_DATE=$(stat -c %y "$f")
        MOD_DATE=${MOD_DATE% *}
#       echo MOD_DATE: $MOD_DATE
        MOD_DATE1=$(date -d "$MOD_DATE" +%s)
#       echo MOD_DATE: $MOD_DATE

DIFF_IN_DATE=$[ $MOD_DATE1 - $TODAY ]
DIFF_IN_DATE1=$[ $MOD_DATE1 - $TODAY_TIME ]
#echo DIFF: $DIFF_IN_DATE
#echo DIFF1: $DIFF_IN_DATE1
if [[ ($DIFF_IN_DATE -ge -120) && ($DIFF_IN_DATE1 -le 120) && (DIFF_IN_DATE1 -ge -120) ]]
then
echo File lies in Next Hour = $f
echo MOD_DATE: $MOD_DATE

#mv $PATH_SRC/$f  $PATH_DST/$f
fi
done
0
 
LVL 7

Expert Comment

by:D Patel
ID: 41799371
And then

tar --newer date -d'7 days ago' +"%d-%b" -zcf thisweek.tgz
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 41799822
Based on your description the files in folderB are compressed, but the ones in folderA aren't and won't be until after the comparison so how do you want to handle that difference when comparing?

With such large files, the best way to make a comparison would be to generate an md5 checksum of each and compare those checksums.  Both files would need to be in the same state (i.e., either compressed or not) when generating the checksum.
0
 

Author Comment

by:shragi
ID: 41799933
Hi FishMonger - to generate checksum if both needs to be in same state, then we can add third folder for moving compressed files.
FolderA - Drop zone where you can find new files
FolderB - Contains old files with timestamp
FolderC - Contains compressed files of FolderB.

So how do we do the checksum I mean how to write the script for that.
0
 
LVL 28

Accepted Solution

by:
FishMonger earned 2000 total points
ID: 41799972
Another factor which you need to take into account is whether or not the file in folderA is still being written when you want to compare it with the file(s) in folderB.  You don't want to try to compress and copy it while it's still being written.

Why keep multiple copies of the same file, especially given their size?  It's not real clear, but it sounds like your looking at having at least 2 maybe even 3 copies of each file (one in each folder).

A quick google search will give you a number of example resources on how to genertate the checksum.  Here's one

This one goes into a little more detail.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A year or so back I was asked to have a play with MongoDB; within half an hour I had downloaded (http://www.mongodb.org/downloads),  installed and started the daemon, and had a console window open. After an hour or two of playing at the command …
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Learn the basics of if, else, and elif statements in Python 2.7. Use "if" statements to test a specified condition.: The structure of an if statement is as follows: (CODE) Use "else" statements to allow the execution of an alternative, if the …
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Suggested Courses

971 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question