How do I copy files named in a list from one directory tree to a different folder?

I'm a newbie at Perl, so I will need a pretty complete answer. I have a list of file names that I need to copy from one directory tree on a drive to a folder on a different drive. The list is a text file that I get every night (generated by a different program) naming the files that need to be copied. So I need to read list.txt, traverse a directory tree on drive D, and find and copy the files in the list to a folder on drive E. The system is Windows.
sam5a1Asked:
 
FishMongerCommented:
We're far enough along that this is just a minor adjustment.

This adds in the suggested adjustment from mjcoyne (which I modified very slightly).

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;
use File::Copy;

my $SrcRootDir = 'F:/data';
my $DestRootDir = 'T:/transfer';
my $transfer = 'T:/transfer/transfer_list.txt';
my %file;

open(IN, '<', $transfer) or die "Could not open $transfer: $!\n";
while(<IN>) {
      chomp;
      my ($filename) = ($_ =~ /^(.+)\..+$/);
      $file{lc($filename)}++;
}
close(IN);

find(\&wanted, $SrcRootDir);

sub wanted {
    my ($filename) = ($_ =~ /^(.+)\..+$/);
    return unless exists $file{lc($filename)};
    print "Copying $File::Find::name to $DestRootDir\n";
    copy($File::Find::name, "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
}
 
Adam314Commented:
What is the format of the list of files?
Something like this should be close:
use File::Copy;

my $SrcRootDir = 'd:/path/to/files';
my $DestRootDir = 'e:/path/to/files';

open(IN, "<list.txt") or die "Could not open list: $!\n";
while(<IN>) {
      chomp;
      copy("$SrcRootDir/$_", "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
}
close(IN);
 
sam5a1Author Commented:
The format is just a plain text file, same as you get with notepad.

 Can I use something like this to get the list of files?
 $transfer="T:\\Team\\transfer\\transfer_list.txt";
so that it will be:
open(IN, "< $transfer") or die "Could not open  $transfer: $!\n";
That way I can have the list automatically replaced every night from the other program.

 and (as being new to perl) is there anything between
#!/usr/bin/perl
and
use File::Copy;


 
Adam314Commented:
Yes, you can use $transfer as you listed.  If the name of the file that holds the list is going to change, you might want to make it a parameter to the script.
So if your script is called docopy.pl, you could start it like so:
    docopy.pl T:\Team\transfer\transfer_list.txt
You would then get the list like this:
$transfer = $ARGV[0];

You can put other statements before use File::Copy;, but typically the use statements are the first thing in the script.
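
A minimal sketch of the command-line approach described above, assuming the script should fall back to a fixed path when no argument is given. The helper name list_path and the default path are illustrative and not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first command-line argument if one was
# given, otherwise a default location (an assumption for this sketch).
sub list_path {
    my ($default, @args) = @_;
    return (@args && length $args[0]) ? $args[0] : $default;
}

my $transfer = list_path('T:/transfer/transfer_list.txt', @ARGV);
print "Reading file list from $transfer\n";
```

Started as "docopy.pl T:\Team\transfer\transfer_list.txt" it would use that path; started with no arguments it would use the default.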

 
sam5a1Author Commented:
OK, here's what I have. This works only if the files are in the root, it will not find the files if they are in a sub directory, (sometimes 3 directories deep, sometimes 2 deep) which most of them are.

#!/usr/bin/perl

use File::Copy;

# $SrcRootDir = 'F:/data';
# $DestRootDir = 'T:/transfer';

$SrcRootDir = "F:\\data\\";
$DestRootDir = "T:\\transfer\\";

$transfer="T:\\transfer\\transfer_list.txt";

open(IN, "<$transfer") or die "Could not open $transfer: $!\n";
while(<IN>) {
      chomp;
      copy("$SrcRootDir/$_", "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
}
close(IN);
 
FishMongerCommented:
Does the list of filenames in your txt file include the path or just filename?  If it's just the filename and the file could be in a subdir of F:\data then I'd probably load the filename into a hash and use the File::Find module to locate the files.

This is untested, but should be close to what you need.

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;
use File::Copy;

my $SrcRootDir = 'F:/data';
my $DestRootDir = 'T:/transfer';
my $transfer = 'T:/transfer/transfer_list.txt';
my %file;

open(IN, '<', $transfer) or die "Could not open $transfer: $!\n";
while(<IN>) {
      chomp;
      $file{$_}++;
}
close(IN);

find(\&wanted, $SrcRootDir);

sub wanted {
    return unless exists $file{$_};
    copy("$SrcRootDir/$_", "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
}
 
FishMongerCommented:
We probably need to change:
copy("$SrcRootDir/$_", "$DestRootDir/$_") or warn "Could not copy $_: $!\n";

to this:
copy(File::Find::name, "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
 
sam5a1Author Commented:
Sorry fishmonger, it still could only find the files if I placed them in the root directory: F:\data
It didn't find the files when I had to go deeper:
i.e., when the files were in F:\data\abc\001 I got the "Could not copy...no file or directory" message.
I didn't understand where you got the variable name in the last line "copy(File::Find::name,...

I also had to rem out the usage of strict or line 24 returned an error.
 
FishMongerCommented:
You should not rem out the use strict line.  That should be in every script you write, without exception, in part because it will point out dumb mistakes like I made.

File::Find::name

should be
$File::Find::name

$File::Find::name is the full path to your file.
$_ (in the wanted sub) is the filename without the path

You'll want to read over the documentation for the File::Find module.
http://search.cpan.org/author/NWCLARK/perl-5.8.8/lib/File/Find.pm

I'm still unsure as to how the filename are listed in the txt file i.e., with or without their path.  If it's the full path, then we'll need to adjust the wanted sub or the loading of the hash.
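
To see the difference between $_ and $File::Find::name described above, here is a small self-contained sketch. It builds a throwaway directory tree with File::Temp rather than touching F:\data; the directory and file names are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a throwaway tree: <tmp>/abc/001/sample.R03 (names are made up).
my $root = tempdir(CLEANUP => 1);
make_path("$root/abc/001");
open(my $fh, '>', "$root/abc/001/sample.R03") or die "Could not create file: $!\n";
close($fh);

my @seen;
find(sub {
    return unless -f $_;                  # skip directories
    push @seen, [$_, $File::Find::name];  # bare name, then full path
    print "\$_                = $_\n";
    print "\$File::Find::name = $File::Find::name\n";
}, $root);
```

Inside the callback, $_ holds only "sample.R03", while $File::Find::name holds the path from the starting directory down to the file, which is why copy() needs the latter as its source.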
 
Perl_DiverCommented:
Why are you not using a windows batch file instead of perl? What version of windows are you running?
 
sam5a1Author Commented:
No, the list has only the names of the files I want, not the full path. $File::Find::name works. One area I just saw that is a problem, if you can help: the files are missed if there are letters that are properly capped.
The file names run like this:    d for digit and l for letter
dldddd00_l.d01 ie 1a123400_1.T04
can you show me how to make it so the caps don't matter and also so the 00_d.001 doesn't need to be included(some files don't have that exact configuration)
so this 1a/A1234* will work
 
sam5a1Author Commented:
Perl_Diver: I just thought that Perl would work the best. A batch file might have done just as well, except that where I'm at we are using Win2000, XP, Win2003 (server) and Win2000 (server). The script will be placed on the Win2000 server, triggered by either an XP or Win2000 workstation (maybe even running off the workstation), and will run across the rest. I wasn't sure that a batch file would work across all that. I know a Perl script will, so that's why I came here to the land of Perl.
 
FishMongerCommented:
>> if you can help, the files are missed if there are letters that are properly capped.

Are you saying that the filenames in the transfer_list.txt file may not be in the same case as the actual files?

See if this version does what you need.

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;
use File::Copy;

my $SrcRootDir = 'F:/data';
my $DestRootDir = 'T:/transfer';
my $transfer = 'T:/transfer/transfer_list.txt';
my %file;

open(IN, '<', $transfer) or die "Could not open $transfer: $!\n";
while(<IN>) {
      chomp;
      $file{lc($_)}++;
}
close(IN);

find(\&wanted, $SrcRootDir);

sub wanted {
    return unless exists $file{lc($_)};
    print "Copying $File::Find::name to $DestRootDir\n";
    copy($File::Find::name, "$DestRootDir/$_") or warn "Could not copy $_: $!\n";
}
 
sam5a1Author Commented:
That didn't return anything. I'll not worry about the caps. I'll handle that if it happens. Can you make a suggestion about the wild card to replace the ending of the files?
 
FishMongerCommented:
I tested that script based on your description of the requirements and it worked as expected.  If it didn't copy any files when you tested it, then either your transfer_list.txt file doesn't contain the data (filenames) as you've described or I've completely misunderstood your requirements.

Did you read the documentation for the File::Find module as I suggested?  It appears not.

The File::Find module descends the directory tree starting from $SrcRootDir and applies the &wanted sub to each file (and directory).

The 1st line in the sub skips over any file that is not in the %file hash i.e., your transfer_list.txt file.
The 2nd line is just a print statement to see what file is about to be copied.
The 3rd line will either copy the file to the dest dir or print an error message.

Due to the way the File::Find module works, there's no need to accommodate using the wild card matching.  If you want to use that approach, we'd need to use a different and more verbose approach.
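
For reference, a rough sketch of what the more verbose wildcard-style approach mentioned above might look like: each found file name is tested against every listed prefix instead of being looked up in a hash. The helper name matches_prefix and the sample names are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name starts with any of the listed prefixes,
# compared case-insensitively.
sub matches_prefix {
    my ($name, $prefixes) = @_;
    my $lower = lc $name;
    for my $prefix (@$prefixes) {
        next unless length $prefix;                  # ignore blank list lines
        return 1 if index($lower, lc $prefix) == 0;  # prefix found at position 0
    }
    return 0;
}

my @prefixes = ('1a1234');   # e.g. lines read from transfer_list.txt
for my $name ('1A123400_R.T03', '2B999900_R.T03') {
    printf "%s: %s\n", $name, matches_prefix($name, \@prefixes) ? 'copy' : 'skip';
}
```

The trade-off is that every file in the tree is compared against every prefix, whereas the hash lookup is a single test per file.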
 
FishMongerCommented:
>> Can you make a suggestion about the wild card to replace the ending of the files?

Does that mean that you want to rename the destination file in addition to copying it to a new location?
 
mjcoyneCommented:
If you want to split the filenames and extensions, you can use a regular expression:

my ($filename, $extension) = ($_ =~ /^(.+)\.(.+)$/);

assuming above that $_ contains the filename.  So, you *might* be able to do something like this (borrowing a piece of FishMonger's code):

open(IN, '<', $transfer) or die "Could not open $transfer: $!\n";
while(<IN>) {
      chomp;
      my ($filename, $extension) = ($_ =~ /^(.+)\.(.+)$/);
      $file{lc($filename)}++;
}

but now the %file hash will contain just the filename (no extension) as keys.  I'm not sure what you're trying to do here -- is this close?

Realize though that the "wanted" sub function will still be dealing with full filenames (with extensions) via the File::Find module, so you'll have to remove the extensions from found files as well, or they won't match the keys of the %file hash.

Is there a possibility that there will be files with the same filename, but different extensions (e.g. myfile.txt and myfile.bak)?  That could get a bit tricky...
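
One way to check for that situation before copying anything is to group the found names by their extension-less form and report any group with more than one member. This is only a sketch; the helper name group_by_base and the sample file names are illustrative.

```perl
use strict;
use warnings;

# Group file names by lowercased base name (extension removed) so that
# clashes such as myfile.txt / myfile.bak can be spotted before copying.
sub group_by_base {
    my @names = @_;
    my %group;
    for my $name (@names) {
        (my $base = lc $name) =~ s/\.[^.]+$//;   # drop the last extension, if any
        push @{ $group{$base} }, $name;
    }
    return \%group;
}

my $group = group_by_base('myfile.txt', 'myfile.bak', 'other.R03');
for my $base (sort keys %$group) {
    my @names = @{ $group->{$base} };
    print "$base is ambiguous: @names\n" if @names > 1;
}
```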
 
sam5a1Author Commented:
Yes I did read it and found it answered some questions I had.
The script works fine if the file name in transfer_list.txt is the same as the actual file. What I tried was to change the text in transfer_list.txt, i.e. if the file in transfer_list.txt is 1A123400_R.T03, I changed it to 1a123400_R.T03. I didn't get any output, so the print statement never ran because it didn't find 1A123400_R.T03 from 1a123400_R.T03 in transfer_list.txt. The wild card I was asking about was to replace the .T03 (or whatever it may be) after the _R. The script won't locate the file 1A123400_R.T03 if the entry in transfer_list.txt is 1A123400_R. There is a chance that when the file name was entered in the other report that I'm getting transfer_list.txt from, it was entered incorrectly, most probably without the .R03.
 
mjcoyneCommented:
BTW, are there files in the source directory that *shouldn't* be copied?  In other words, are we relying on a list of filenames that we don't really need, if we're just copying all files from the source to the target?
 
FishMongerCommented:
The script I provided is case insensitive, so changing 1A123400_R.T03 to 1a123400_R.T03 would not make any difference and the file would still get copied.  However, if you changed the names in some other manner, such as removing the extension, then it would skip over all files and nothing would be copied.

If you want to strip the extensions, then we'd need to make a modification such as mjcoyne shows and that would also need to be done in the wanted sub for the filename in the $_ var.
 
FishMongerCommented:
Another "gotcha" that you might need to look out for is having 2 files with the same name in different subdirectories.  If that's a possibility, then only the last one seen will survive in the dest dir, i.e., prior ones will be overwritten.
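
A possible guard against that, sketched under the assumption that the first file found should win and later ones should be skipped with a warning. The %copied hash and the helper name first_with_name are illustrative; the check would be called from the wanted sub just before copy().

```perl
use strict;
use warnings;

my %copied;   # lowercased file name => full path of the first file seen

# Hypothetical helper: returns true the first time a name is seen and
# false (with a warning) for later files that would overwrite it.
sub first_with_name {
    my ($name, $full_path) = @_;
    my $key = lc $name;
    if (exists $copied{$key}) {
        warn "Skipping $full_path: $name already copied from $copied{$key}\n";
        return 0;
    }
    $copied{$key} = $full_path;
    return 1;
}

for my $path ('F:/data/x/a.R03', 'F:/data/y/a.R03') {
    my $verdict = first_with_name('a.R03', $path) ? 'copy' : 'skip';
    print "$verdict $path\n";
}
```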
 
sam5a1Author Commented:
Oops, my turn to make a stupid mistake: I only included the lc() one time. The file extension is going to be one of two, either .R03 or .A03. I was hoping that I could use a wild card for the extensions so I wouldn't care which one it is.
 mjcoyne: There isn't any files that shouldn't get copied. The list is being pulled from another program (where a person enters the file name and most usually without the extension) and those are the files I want to copy to a different directory.
FishMonger: Yes, stripping the extension is the best bet. As I said, I can guarantee that when the file names were entered in the other program the extension was left off of some (most? I don't know) of the file names. Those are what I will be getting when I pull the list. You have answered the original question, so if you want I will give you the points for this and open another question for this part if you feel this is going far beyond the original question.
 
FishMongerCommented:
I didn't test that last version and I think we should make a slight adjustment to the regex when building the hash.

my ($filename) = ($_ =~ /^(.+)(\..+)?$/);
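
One caveat worth checking with the pattern above: the first .+ is greedy and the extension group is optional, so the first capture can take the whole name, extension included. A sketch of an alternative, assuming each name carries at most one trailing extension to strip; the helper name base_name is illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase a name and drop its final extension, if any.
# The non-greedy .+? lets the optional extension group match when present.
sub base_name {
    my ($name) = @_;
    my ($base) = ($name =~ /^(.+?)(?:\.[^.]+)?$/);
    return defined $base ? lc $base : undef;
}

for my $name ('1A123400_R.T03', '1A123400_R') {
    print "$name -> ", base_name($name), "\n";
}
```

Both sample names come out as 1a123400_r, so a list entry without an extension and a file on disk with one end up under the same hash key, provided the same helper is used when building the hash and inside the wanted sub.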
 
mjcoyneCommented:
Since there are no files that won't need to be copied, wouldn't the dircopy() function of File::Copy::Recursive (see http://search.cpan.org/dist/File-Copy-Recursive/Recursive.pm) be a better choice here?  If all the files are being copied, the list of file names is irrelevant and introduces potential inaccuracies that need to be watched for and overcome.
 
sam5a1Author Commented:
Had a small detour but I'm back now.

Been looking at your regex, would you mind going over what you're doing? I don't quite follow either one well enough to understand exactly.

 
FishMongerCommented:
Assuming that there are other files in the tree besides the ones with the .R03 or .A03 extensions, a better option might be File::Find::Rule.

http://search.cpan.org/~rclamp/File-Find-Rule-0.30/lib/File/Find/Rule.pm
 
sam5a1Author Commented:
I can assure you that there are only 2 file extensions being hunted. If there are other extensions residing there, they are very few and they are there because someone made an oops.

File::Find::Rule does appear to be very versatile. The not, any, or, grep, etc. that can be used as rules for the find make it quite powerful. I have never seen File::Find::Rule before, thanks.
 
FishMongerCommented:
The regular expression:

(?-imsx:^(.+)(\..+)?$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    \.                       '.'
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you're using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
 
mjcoyneCommented:
my ($filename) = ($_ =~ /^(.+)\..+$/);
says to capture everything from the start of the line up to the final period before the end of the line and put the result into a variable called $filename.  I had capturing parentheses around the final .+ as well, to capture the file's extension into a second variable called $extension, but since retaining the extension was not needed, FishMonger modified it by removing that set of parentheses.

I'm still unclear why you need a rule -- if all the files are being copied and no files are not, just copy all the files with File::Copy::Recursive.  Or use rsync, especially if not all files will have changed when the copy operation occurs...
 
sam5a1Author Commented:
mjcoyne: Yes, if all the files were being copied, but only the files that are listed in the _list.txt are being copied. That may be anywhere from 3 to 260+ files out of 83,000+ existing files.
 
sam5a1Author Commented:
mjcoyne: Also, since the files listed in the _list.txt may or may not have an extension (i.e. 1A123400_B instead of 1A123400_B.R03), the extension has to be removed for consistency. There exists one and only one file 1A123400_B in the directory, and that file may have an extension of .R03 or .A03 (this depends on how the file was created), so I needed the extension stripped off of the hash key, just in case there was one there from the _list.txt, and the $filename had to be able to match either 1A123400_B.R03 or 1A123400_B.A03.
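
An end-to-end sketch of that requirement, run against throwaway directories created with File::Temp instead of F:/data and T:/transfer. The helper name key_for is illustrative, and 9Z999900_B.R03 is a made-up file that is not on the list; the list entry deliberately has no extension and a different case.

```perl
use strict;
use warnings;
use File::Copy;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway source and destination trees stand in for the real drives.
my $src  = tempdir(CLEANUP => 1);
my $dest = tempdir(CLEANUP => 1);
make_path("$src/abc/001");
for my $name ('1A123400_B.A03', '9Z999900_B.R03') {
    open(my $fh, '>', "$src/abc/001/$name") or die "Could not create $name: $!\n";
    close($fh);
}

# Normalize a name the same way on both sides: lowercase, final extension removed.
sub key_for {
    my ($name) = @_;
    (my $key = lc $name) =~ s/\.[^.]+$//;
    return $key;
}

# One list entry, given without an extension and in lower case.
my %file;
$file{ key_for($_) }++ for ('1a123400_b');

find(sub {
    return unless -f $_ && exists $file{ key_for($_) };
    copy($File::Find::name, "$dest/$_") or warn "Could not copy $_: $!\n";
}, $src);
```

Only 1A123400_B.A03 should end up in the destination directory, since it is the only file whose normalized name is in the hash.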
 
FishMongerCommented:
On the regex, you'll notice that I add back-in the capturing parens around the extension, but since the extension may or may not be attached to the filename in the txt file, I used the ? quantifier to make that portion of the pattern match optional.
 
mjcoyneCommented:
Okay, sorry -- I was confused by your answer above ("mjcoyne: There isn't any files that shouldn't get copied.")...
 
sam5a1Author Commented:
mjcoyne: Sorry about that. I meant that there isn't any files listed in the _list.txt that shouldn't get copied and that there are no files off limits in the directories.
 
sam5a1Author Commented:
FishMonger: Your script does what I need so thanks for the help and I'll get you the points now.

mjcoyne: Thanks for taking the time to answer questions and help train a Perl newbie.
Question has a verified solution.
