Solved

Perl: copy files while avoiding duplicates

Posted on 2014-10-06
Last Modified: 2014-10-07
Hi,

I am writing the script below to copy files from one location to another, but how can I avoid copying duplicates?
I know what copy does: if a file with the same name already exists in the destination, it is replaced. I don't think that is a good idea, because I generally have at least 1000 files in the source folder and get at most about 20 new files each day. When I run my Perl program it only needs to copy those 20 files, not all 1000. Copying all 1000 files again is a waste of time and server resources.

So how can I check beforehand whether a file already exists in the destination?
Also, which is the better solution in terms of memory and speed: replacing duplicate files, or checking for duplicate files first?
From what I know, checking for duplicates is the more efficient way, because it avoids rewriting the full contents of files that are already there.




My Perl script:
#!/usr/bin/perl

use strict;
use warnings;

my $source = "C:\\test";
my $destination = "C:\\test1";

opendir(DIR,"$source") or die "Cannot open $source\n";
my @files = readdir(DIR);
closedir(DIR);

foreach my $file (@files) {
  next if ($file !~ /\.txt$/i);
  system("copy \"$source\\$file\" \"$destination\"");
 
}
Question by:shragi
4 Comments
 
LVL 26

Assisted Solution

by:wilcoxon
wilcoxon earned 350 total points
ID: 40364673
In your foreach loop, before the copy, add the following line so that any file already present in the destination is skipped:
next if (-f "$destination/$file");
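
Putting that check into the whole loop, a minimal sketch might look like the following. It swaps the system("copy ...") call for the core File::Copy module, which is an assumption on my part and not something the original script uses; the sub name copy_new_files is hypothetical as well. It compares file names only, so a source file that changed after it was first copied would not be picked up again.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Copy every .txt file from $source into $destination, skipping any
# file name that already exists there. Returns the number of files copied.
# (copy_new_files is a hypothetical helper name, not from the original post.)
sub copy_new_files {
    my ($source, $destination) = @_;

    opendir(my $dh, $source) or die "Cannot open $source: $!\n";
    my @files = grep { /\.txt$/i } readdir($dh);
    closedir($dh);

    my $copied = 0;
    foreach my $file (@files) {
        my $from = File::Spec->catfile($source, $file);
        my $to   = File::Spec->catfile($destination, $file);

        next unless -f $from;   # ignore directories that happen to end in .txt
        next if -e $to;         # already present in the destination, skip it

        copy($from, $to) or die "Copy of $from failed: $!\n";
        $copied++;
    }
    return $copied;
}

# Example call with the paths from the question:
# my $n = copy_new_files("C:\\test", "C:\\test1");
# print "Copied $n new file(s)\n";
```

On a folder of 1000 files with 20 new ones, this performs 1000 cheap existence checks and 20 copies, rather than rewriting all 1000 files.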


 
LVL 48

Expert Comment

by:Tintin
ID: 40364890
Why not just use rsync?
 

Author Comment

by:shragi
ID: 40364898
What is rsync?
 
LVL 48

Accepted Solution

by:Tintin
Tintin earned 150 total points
ID: 40365096
rsync is a fantastic tool for copying files (locally or remotely): it copies only what has changed.

It is very fast and efficient.

See http://rsync.samba.org for details.

For Windows, there are various ports available (including Cygwin).
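
If rsync is available (for example through Cygwin), calling it from Perl could look roughly like this. The /cygdrive/c/... paths assume a Cygwin build of rsync, and the helper name rsync_command is made up for illustration; treat it as a sketch and not a tested recipe for any particular Windows port.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the rsync argument list (hypothetical helper, not part of rsync).
#   -a                  archive mode: recurse and preserve times/permissions
#   --ignore-existing   never overwrite a file already in the destination
#   --include/--exclude limit the transfer to top-level *.txt files
#                       (case-sensitive, unlike the /\.txt$/i test in the script)
sub rsync_command {
    my ($source, $destination) = @_;
    $source .= '/' unless $source =~ m{/$};   # trailing slash: copy the contents
    return ('rsync', '-a', '--ignore-existing',
            '--include=*.txt', '--exclude=*',
            $source, $destination);
}

# Assumed Cygwin-style paths for C:\test and C:\test1.
my @cmd = rsync_command('/cygdrive/c/test', '/cygdrive/c/test1');
print "@cmd\n";

# Uncomment to actually run it; the list form of system() avoids shell quoting.
# system(@cmd) == 0 or die "rsync failed: $?\n";
```

Dropping --ignore-existing lets rsync also refresh files whose size or modification time has changed, while still skipping the ones that are unchanged.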