Solved

Perl: copy files while avoiding duplicates

Posted on 2014-10-06
Last Modified: 2014-10-07
Hi,

I am writing the script below to copy files from one location to another, but how can I avoid copying duplicates?
I know what copy does: if a file with the same name already exists in the destination, it replaces it. I don't think that is a good idea, because I generally have at least 1000 files in the source folder and receive at most about 20 new files each day. When I run my Perl program it only needs to copy those 20 files, not all 1000; copying all 1000 again wastes time and server resources.

So how can I check beforehand whether a file already exists in the destination?
Also, which is the better solution in terms of memory and speed: replacing duplicate files, or checking for duplicates first?
From what I know, checking for duplicates is the more efficient way, because it avoids rewriting the full contents of every file that would otherwise be replaced.




My Perl script:
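#!/usr/bin/perl

use strict;
use warnings;

my $source = "C:\\test";
my $destination = "C:\\test1";

# Read the names of all entries in the source folder.
opendir(my $dh, $source) or die "Cannot open $source: $!\n";
my @files = readdir($dh);
closedir($dh);

foreach my $file (@files) {
  # Only .txt files are of interest.
  next if ($file !~ /\.txt$/i);
  # Shell out to the Windows copy command for each .txt file.
  system("copy \"$source\\$file\" \"$destination\"");
}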
Question by:shragi

Assisted Solution

by:wilcoxon
wilcoxon earned 350 total points
ID: 40364673
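In your foreach loop, before the system() call, add the following so that files already present in the destination are skipped: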
next if (-f "$destination/$file");
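
Putting that check into the whole script, a minimal sketch might look like the following. It assumes the same folder layout as the question, swaps the shelled-out copy command for the core File::Copy module (a substitution for illustration, not part of the original script), and wraps the loop in a hypothetical helper named copy_new_files. It only compares file names, so a source file that changed after an earlier copy would still be skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Hypothetical helper: copy every .txt file from $source to $destination
# unless an entry with the same name already exists in the destination.
# Returns the list of file names that were actually copied.
sub copy_new_files {
    my ($source, $destination) = @_;

    opendir(my $dh, $source) or die "Cannot open $source: $!\n";
    my @files = grep { /\.txt$/i } readdir($dh);
    closedir($dh);

    my @copied;
    foreach my $file (sort @files) {
        my $from = File::Spec->catfile($source, $file);
        my $to   = File::Spec->catfile($destination, $file);

        next unless -f $from;   # ignore directories that happen to end in .txt
        next if -e $to;         # already in the destination, so skip it

        copy($from, $to) or die "Copy of $from failed: $!\n";
        push @copied, $file;
    }
    return @copied;
}

# Possible usage with the paths from the question:
# my @copied = copy_new_files("C:\\test", "C:\\test1");
# print "Copied: @copied\n";
```

The existence test is a single metadata lookup per file, so skipping 980 files that are already present should cost far less than rewriting them.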


Expert Comment

by:Tintin
ID: 40364890
Why not just use rsync?

Author Comment

by:shragi
ID: 40364898
What is rsync?

Accepted Solution

by:Tintin
Tintin earned 150 total points
ID: 40365096
rsync is a fantastic tool for copying files (locally or remotely); it transfers only the files that are new or have changed.

It is very fast and efficient.

See http://rsync.samba.org for details.

For Windows, there are various ports available (including one for Cygwin).
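
If installing rsync is not an option, the same idea can be approximated in plain Perl. By default rsync treats a file as unchanged when its size and modification time match in both places. Here is a minimal sketch of that check, using hypothetical helper names needs_copy and sync_file (they are illustrative, not part of rsync or of any Perl module):

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Hypothetical helper: true when $to is missing, or when its size or
# modification time differs from that of $from.
sub needs_copy {
    my ($from, $to) = @_;
    return 1 unless -e $to;

    # stat() fields: index 7 is the size in bytes, index 9 is the mtime.
    my ($from_size, $from_mtime) = (stat($from))[7, 9];
    my ($to_size,   $to_mtime)   = (stat($to))[7, 9];

    return ($from_size != $to_size || $from_mtime != $to_mtime) ? 1 : 0;
}

# Hypothetical helper: copy $from to $to only when needed.
# Returns 1 if a copy was made and 0 if the file was skipped.
sub sync_file {
    my ($from, $to) = @_;
    return 0 unless needs_copy($from, $to);

    copy($from, $to) or die "Copy of $from failed: $!\n";

    # Give the copy the source's mtime so the next run sees a match.
    my $mtime = (stat($from))[9];
    utime($mtime, $mtime, $to) or die "Cannot set mtime on $to: $!\n";
    return 1;
}
```

Unlike a plain existence test, this also picks up a source file that was edited after it was first copied, at the cost of two stat() calls per file.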