Solved

Complex batch renaming/moving of files using perl

Posted on 2009-07-01
20
849 Views
Last Modified: 2012-05-07
Having been given the task of renaming thousands of images i set out to find a simple solution and came across the perl script below it works very well but wouldn't mind help in changing it, scripting and regex aren't my strong point but the script below does do a good job of renaming files in a directory.
OK, to start with my client is supplying us with a .csv file containing the oldproductID and newproductID
the file names will look something like this 10000.eps, 10000_01.eps, 10000_02.eps and will need to change to 220000.eps, 220000_01.eps, 220000_02.eps and so on only the first 4-6 digits of the file need to change the underscore is used only when there are multiple shots of the same product and isn't always present, the files are not all in the same directory but broken into images/10000_10100 and so on, also using this script it gives no indication that the name has been changed. So what i need is
A script that will use the .csv file supplied look through a directory and its subdirectories for corresponding files rename them and move the now renamed file to a renamed directory for sorting later.
Thanks
#!/usr/bin/env perl

use warnings;

use strict;

# first slurp in the dictionary

my %dict;

while(<>) {

$dict{$1} = $2 if(m/^(\d+),(.*)$/);

}

# now process every file in the current directory

opendir DH, "." or die "Cannot open current directory for processing, check permissions\n";

foreach my $file(readdir DH) {

# check if filename adheres to pattern

if($file =~ m/^(\d+)\s/) {

# check if there's a translation for it

if(exists($dict{$1})) {

# rename the file

my ($new, $id, $name) = ($file, $1, $dict{$1});

$new =~ s/^($id)/$name/;

rename($file, $new);

print "Renamed $file to $new\n";

} else {

print "Could not rename $file as it's not mapped in the dictionary\n";

}

} else {

print "Could not rename $file as it doesn't start with a product ID followed by a space\n";

}

}

Open in new window

0
Comment
Question by:mce-man-it
  • 11
  • 7
  • 2
20 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 24753889
are you saying you want to change
rename($file, $new);
to
rename($file, "/renamed directory/$new");
0
 

Author Comment

by:mce-man-it
ID: 24755335
that's part of it solved
thanks


ozo

are you saying you want to change
rename($file, $new);
to
rename($file, "/renamed directory/$new");
0
 
LVL 84

Expert Comment

by:ozo
ID: 24756764
then what do you want changed?
do you just want to apply it to subdirectories with
   use File::Find;
   find(sub{
$file=$_;
if($file =~ m/^(\d+)\s/) {
# check if there's a translation for it
if(exists($dict{$1})) {
# rename the file
my ($new, $id, $name) = ($file, $1, $dict{$1});
$new =~ s/^($id)/$name/;
rename($file, $new);
print "Renamed $file to $new\n";
} else {
print "Could not rename $file as it's not mapped in the dictionary\n";
}
} else {
print "Could not rename $file as it doesn't start with a product ID followed by a space\n";
}
}
},".");
0
 

Author Comment

by:mce-man-it
ID: 24783670
Sorry for the slow reply to you post ozo, the answer is yes i do need the script to look through subdirectories and where necessary change/rename the file and put into a holding folder.

Heres a quick overview

I get given a comma delimited file with oldname,newname and so on
I run a script that ingests this file looks through our numerous directories finds/changes file names move to holding area

the script i posted was found on another forum and only works in the directory its run and doesn't look through subdirectories here's how i got it to run and ingest the supplied file
./scriptname.pl commadelimitedfile

the other bit that i had difficulty with was the regular expression   m/^(\d+)\s/    this looks for a file that starts with digits then a space I need it to search for files that could be 12345.eps or 12345_01.eps but only changes the 12345 of a file name

thanks hope this is a bit clearer than my lengthy first post.
0
 
LVL 39

Accepted Solution

by:
Adam314 earned 400 total points
ID: 24807838

#!/usr/bin/perl

use strict;

use warnings;

use Data::Dumper;

use File::Find;
 

##### Set these as needed

my $csv_filename = 'filename.csv';

my $holding_directory = '/path/to/holding';
 
 

##### Read CSV to get old-2-new conversion

my %old2new;

open(my $csv, '<filename.csv') or die "Could not open csv: $!\n";

while(<$csv>) {

	chomp;

	my @f=split(/\s*,\s*/, $_, 2);

	$old2new{$f[0]} = $f[1];

}

close($csv);
 

##### Search for files, rename, move to holding directory for later sorting

find(\&Found, '.');
 

sub Found {

	return unless -f $File::Find::name;  #Skip non-files

	return unless /^(\d+)(.*)$/;         #Skip files that don't begin with a number

	my ($oldnum, $ext) = ($1, $2);       #Get number, and everything else (eg: "_1.eps")

	return unless $old2new{$oldnum};     #Skip files that don't have a mapping from old number to new number

	my $newpathname="$holding_directory/$old2new{$oldnum}$ext";

	rename($File::Find::name, $newpathname)

	  or die "Could not rename $File::Find::name to $newpathname: $!\n";

}

Open in new window

0
 

Author Comment

by:mce-man-it
ID: 24812745
Adam314 this works great, but when run in a directory it only renames/moves files contained in that directory and doesn't search sub directories?

much appreciated by the way
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24815986
It should search all sub directories also....

Try this small change, and run again.  Does it list subdirectories?
...

sub Found {

    print "Checking '$File::Find::name'\n";    #NEW LINE

    return unless -f $File::Find::name;  #Skip non-files

    return unless /^(\d+)(.*)$/;         #Skip files that don't begin with a number

...

Open in new window

0
 

Author Comment

by:mce-man-it
ID: 24821174
looking at the terminal print out (below) it does list my test subdirectory   /files_renaming   and it also lists the test files contained within

./rename_files.pl filename
Checking '.'
Checking './.HSancillary'
Checking './rename_files.pl'
Checking './rename_files_old.pl'
Checking './filename'
Checking './10481_02.eps'
Checking './files_renaming'
Checking './files_renaming/13122.eps'
Checking './files_renaming/10025.eps'
Checking './files_renaming/.HSancillary'
Checking './files_renaming/10025_01.eps'
Checking './files_renaming/10008_03.eps'
Checking './files_renaming/10025_02.eps'
Checking './files_renaming/10008_02.eps'
Checking './files_renaming/10008_01.eps'
Checking './files_renaming/10481_02.eps'
Checking './files_renaming/.DS_Store'
0
 
LVL 39

Assisted Solution

by:Adam314
Adam314 earned 400 total points
ID: 24823969
For debugging, try this:
sub Found {

    print "Checking '$File::Find::name'\n";

    print("    skip: non-file\n"),return unless -f $File::Find::name;  #Skip non-files

    print("    skip: no number\n"),return unless /^(\d+)(.*)$/;         #Skip files that don't begin with a number

    my ($oldnum, $ext) = ($1, $2);       #Get number, and everything else (eg: "_1.eps")

    print("    skip: no mapping\n"),return unless $old2new{$oldnum};     #Skip files that don't have a mapping from old number to new number

    my $newpathname="$holding_directory/$old2new{$oldnum}$ext";

    print "    rename to '$newpath'\n"),

    rename($File::Find::name, $newpathname)

      or die "Could not rename $File::Find::name to $newpathname: $!\n";

}

Open in new window

0
 

Author Comment

by:mce-man-it
ID: 24824379
from terminal any help

./rename_files.pl filename
Global symbol "$newpath" requires explicit package name at ./rename_files.pl line 32.
syntax error at ./rename_files.pl line 32, near ""    rename to '$newpath'\n")"
Execution of ./rename_files.pl aborted due to compilation errors.
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 

Author Comment

by:mce-man-it
ID: 24824453
just in case i've done something wrong here's the code
#!/usr/bin/perl

use strict;

use warnings;

use Data::Dumper;

use File::Find;

 

##### Set these as needed

my $csv_filename = 'filename';

my $holding_directory = '/RAID/Test_Volume/renamed_directory';

 

 

##### Read CSV to get old-2-new conversion

my %old2new;

open(my $csv, '<filename') or die "Could not open csv: $!\n";

while(<$csv>) {

        chomp;

        my @f=split(/\s*,\s*/, $_, 2);

        $old2new{$f[0]} = $f[1];

}

close($csv);

 

##### Search for files, rename, move to holding directory for later sorting

find(\&Found, '.');

 

sub Found {

    print "Checking '$File::Find::name'\n";

    print("    skip: non-file\n"),return unless -f $File::Find::name;  #Skip non-files

    print("    skip: no number\n"),return unless /^(\d+)(.*)$/;         #Skip files that don't begin with a number

    my ($oldnum, $ext) = ($1, $2);       #Get number, and everything else (eg: "_1.eps")

    print("    skip: no mapping\n"),return unless $old2new{$oldnum};     #Skip files that don't have a mapping from old number to new number

    my $newpathname="$holding_directory/$old2new{$oldnum}$ext";

    print "    rename to '$newpath'\n"),

    rename($File::Find::name, $newpathname)

      or die "Could not rename $File::Find::name to $newpathname: $!\n";

}

Open in new window

0
 
LVL 39

Expert Comment

by:Adam314
ID: 24827660
Had a typo.  Line 32 should be:
    print "    rename to 'newpathname'\n";
0
 

Author Comment

by:mce-man-it
ID: 24837862
terminal print out with amended line 32, it looks as if the path is added as part of the file name when in subdirectories, so the files are skipped?

 ./rename_files.pl filename
Checking '.'
    skip: non-file
Checking './.HSancillary'
    skip: no number
Checking './rename_files.pl'
    skip: no number
Checking './rename_files_old.pl'
    skip: no number
Checking './filename'
    skip: no number
Checking './10481_02.eps'
    skip: no mapping
Checking './.DS_Store'
    skip: no number
Checking './.HSResource'
    skip: non-file
Checking './files_renaming'
    skip: non-file
Checking './files_renaming/13122.eps'
    skip: non-file
Checking './files_renaming/10025.eps'
    skip: non-file
Checking './files_renaming/.HSancillary'
    skip: non-file
Checking './files_renaming/10025_01.eps'
    skip: non-file
Checking './files_renaming/10008_03.eps'
    skip: non-file
Checking './files_renaming/10025_02.eps'
    skip: non-file
Checking './files_renaming/10008_02.eps'
    skip: non-file
Checking './files_renaming/10008_01.eps'
    skip: non-file
Checking './files_renaming/10481_02.eps'
    skip: non-file
Checking './files_renaming/.DS_Store'
    skip: non-file
Checking './files_renaming/.HSResource'
    skip: non-file
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24843166
It looks like most of the files you would want renamed are being skipped because they are not a "plain file".  Are these files symlinks?  Or pipes?
If you want to do the renaming for all these files, you could change line 27 to
print("    skip: directory\n"),return if -d $File::Find::name;  #Skip directories

Open in new window

0
 

Author Comment

by:mce-man-it
ID: 24848189
no they're not symlinks or pipes and if i move the same files up a directory were the script is and run it, the files are renamed and moved?
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24849994
Does it work if you use the update in post 24843166?
0
 

Author Comment

by:mce-man-it
ID: 24851070
tried the new line in post 24843166 but still a no go it says    Checking './files/10008_02.eps'    skip: non-file       in the terminal with each file, but if i move the files up a directory it works great.
0
 
LVL 39

Assisted Solution

by:Adam314
Adam314 earned 400 total points
ID: 24851327
Okay - the problem is because the top directory being specified is ".", so all the files found begin with "./".  This works for the current directory.  But the find routine changes the current directory when it is searching (the current directory is whatever directory is being searched), so the "./" refers to the new current directory, and the path is no longer correct.

To fix, there are 2 options:
1) Change the directory name from "." to an absolute directory name
    If you want it to always be the current directory, you could use $ENV{PWD},like so:
        find(\&Found, $ENV{PWD});
    or
        use Cwd;     #At the top of your script
        find(\&Found, getcwd);

2) Change the script to deal with this.  This will cause the script to run a little bit slower.  If this is what you need though, let me know.
0
 

Author Comment

by:mce-man-it
ID: 24851520
amended the script as you suggested and it works like a dream
        use Cwd;     #At the top of your script
        find(\&Found, getcwd);

thanks for all your help i've attached the final code just in case anybody else finds it useful
#!/usr/bin/perl

use strict;

use warnings;

use Data::Dumper;

use File::Find;

use Cwd;
 
 

##### Set these as needed

my $csv_filename = 'filename';

my $holding_directory = '/RAID/Test_Volume/renamed_directory';

 

 

##### Read CSV to get old-2-new conversion

my %old2new;

open(my $csv, '<filename') or die "Could not open csv: $!\n";

while(<$csv>) {

        chomp;

        my @f=split(/\s*,\s*/, $_, 2);

        $old2new{$f[0]} = $f[1];

}

close($csv);

 

##### Search for files, rename, move to holding directory for later sorting

find(\&Found, getcwd);

 

sub Found {

    print "Checking '$File::Find::name'\n";

    print("    skip: non-file\n"),return unless -f $File::Find::name;  #Skip non-files

    print("    skip: no number\n"),return unless /^(\d+)(.*)$/;         #Skip files that don't begin with a number

    my ($oldnum, $ext) = ($1, $2);       #Get number, and everything else (eg: "_1.eps")

    print("    skip: no mapping\n"),return unless $old2new{$oldnum};     #Skip files that don't have a mapping from old number to new number

    my $newpathname="$holding_directory/$old2new{$oldnum}$ext";

    print "    rename to 'newpathname'\n";

    rename($File::Find::name, $newpathname)

      or die "Could not rename $File::Find::name to $newpathname: $!\n";

}

Open in new window

0
 

Author Closing Comment

by:mce-man-it
ID: 31598715
thanks again for your time and effort
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

A year or so back I was asked to have a play with MongoDB; within half an hour I had downloaded (http://www.mongodb.org/downloads),  installed and started the daemon, and had a console window open. After an hour or two of playing at the command …
In the distant past (last year) I hacked together a little toy that would allow a couple of Manager types to query, preview, and extract data from a number of MongoDB instances, to their tool of choice: Excel (http://dilbert.com/strips/comic/2007-08…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now