StephenMcGowan

asked on

Perl script to handle CSV files

Hi.

I'm currently trying to write a Perl script.
I have a set of CSV files in the following directory:
"C:\Users\Stephen\Desktop\Database_Design"

The files are currently named in the format: "20130730_p12_A1.csv" (csvfiles.jpg attached).

Each CSV file contains two columns: "mass" and "intensity".
First, I'd like to take the full name of each file (20130730_p12_A1) and add it in a new column next to the existing two, under the heading "filename". The filename should then be "copied down" to the last row of the table (filename-column.jpg attached).




Once this column has been added to each CSV file, the next step will be to modify the filename for each CSV file.

First of all, the date and "p12" should be removed from each filename, so:

"20130730_p12_A1.csv"

would simply read:

A1.csv
....

A1.csv
A2.csv
A3.csv
etc..

Secondly, a text file (mapcodelist.txt) will be used to add information into the filename.

To keep the naming consistent, I'd like to change any full stops (".") within mapcodelist.txt to underscores ("_").
I'd then like to tie up this mapcodelist information with the CSV files.

For example:

Using mapcodelist.txt (attached), A1.csv would become:

"A1_GO1_3.csv"

A2.csv becoming A2_G0_808.csv
A3.csv becoming A3_OUM19363.csv

and so on...

If anybody could possibly help me out with this, I'd very much appreciate it.

Thanks

Stephen.
csvfiles.jpg
filename-column.jpg
mapcodelist.txt
ozo

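#!/usr/bin/perl
use strict;
use warnings;

# Read the map codes: each line is "<well> <code>"; dots in the code become underscores.
my %m;
open my $map, '<', 'mapcodelist.txt' or die "mapcodelist.txt $!";
while( <$map> ){
  my($k,$v)=split;
  next unless defined $v;   # skip blank or malformed lines
  $v=~s/\./_/g;
  $m{$k}=$v;
}
close $map;

# Append the file name (as globbed, so including the directory) to every line, editing in place.
@ARGV=<C:/Users/Stephen/Desktop/Database_Design/20130730_p12_*.csv>;
$^I=".bak";   # keep a .bak copy of each original file
while( <> ){
  s/$/,$ARGV/;
  print;
}

# Rename e.g. 20130730_p12_A1.csv to A1_<code>.csv
for( <C:/Users/Stephen/Desktop/Database_Design/20130730_p12_*.csv> ){
  (my $r=$_) =~ s/\w+_(\w+)(?=\.csv)/${1}_$m{$1}/;
  rename $_,$r or warn " rename $_,$r  $!";
}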
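
To try the mapping and renaming logic without touching any files, the two steps can be pulled out into small functions. This is only a sketch: the helper names parse_map and new_name are made up for illustration, and it assumes mapcodelist.txt has two whitespace-separated columns per line (which is what the split in the script above implies; the attachment itself isn't shown here).

```perl
use strict;
use warnings;

# Hypothetical helper: build a well => code map from lines like "A1 GO1.3".
# The two-column layout is an assumption about mapcodelist.txt.
sub parse_map {
    my %map;
    for my $line (@_) {
        my ($well, $code) = split ' ', $line;
        next unless defined $code;      # skip blank or one-column lines
        $code =~ tr/./_/;               # full stops become underscores
        $map{$well} = $code;
    }
    return \%map;
}

# Hypothetical helper: compute the new name for one file, without renaming it.
sub new_name {
    my ($path, $map) = @_;
    (my $new = $path) =~ s/\w+_(\w+)(?=\.csv$)/${1}_$map->{$1}/;
    return $new;
}

my $map = parse_map("A1 GO1.3\n", "A3 OUM19363\n");
print new_name('20130730_p12_A1.csv', $map), "\n";   # A1_GO1_3.csv
print new_name('20130730_p12_A3.csv', $map), "\n";   # A3_OUM19363.csv
```

The greedy \w+ swallows the date and the p-number (underscores count as word characters), so only the last underscore-separated part before ".csv" is captured and looked up.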
StephenMcGowan

ASKER

Thanks ozo,

The script worked great. The only things I've noticed are that the table header "filename" isn't created, and that the full directory path is written to the spreadsheet, where I'd only like the date, the "p-number" (p12) and the "A-number" (A5).
(see the two jpgs attached)

Also, I've noticed that in your script you've hard-coded the date in the file path:

@ARGV=<C:/Users/Stephen/Desktop/Database_Design/20130730_p12_*.csv>;

The date in the filename will vary, so it shouldn't be fixed in the script. Would it be possible to simply look for all CSV files within a folder, regardless of the "20130730_p12_" prefix?

Thanks again,

Stephen.
table1.jpg
table2.jpg
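
One possible way to cover the three points above (a "filename" header, the bare file name instead of the full path, and picking up every CSV in the folder) is sketched below. This is not the accepted answer, just an illustration: the helper name add_filename_column is made up, the folder is taken from the command line, and the first line of each file is assumed to be the "mass,intensity" header row.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);

# Hypothetical helper: append a "filename" column to the lines of one CSV.
# Assumes the first line is the header row.
sub add_filename_column {
    my ($path, @lines) = @_;
    my $name = basename($path);                 # drop the directory part
    my @out;
    for my $i (0 .. $#lines) {
        (my $line = $lines[$i]) =~ s/\r?\n\z//; # strip the line ending
        push @out, $line . ',' . ($i == 0 ? 'filename' : $name);
    }
    return @out;
}

# Process every CSV in the given folder, whatever its date prefix.
my $dir = shift @ARGV;   # e.g. C:/Users/Stephen/Desktop/Database_Design
if (defined $dir) {
    for my $path (bsd_glob("$dir/*.csv")) {
        rename $path, "$path.bak" or die "rename $path: $!";   # keep a backup
        open my $in,  '<', "$path.bak" or die "$path.bak: $!";
        open my $out, '>', $path       or die "$path: $!";
        print {$out} "$_\n" for add_filename_column($path, <$in>);
        close $in;
        close $out or die "$path: $!";
    }
}
```

Run with the folder as the only argument. Because each original is first renamed to .csv.bak, it's worth trying this on a copy of the folder first.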
ASKER CERTIFIED SOLUTION
ozo

This solution is only available to members.
Hi ozo,

Thanks a lot for getting back to me. I've submitted the question and awarded the points to you. :)

Just one last request with this script:

How do I go about removing the ".csv" extension from the entries in this newly created column? I also want to give this column a header: "filename" (table1.jpg).

I also want to create a new column with the heading "samplename". This takes the second part of the name that was used when renaming the spreadsheet and copies it down the column. For example, A1_GO1_3.csv would take GO1_3, add it to the samplename column and copy it down (table2.jpg).

Thanks,

Stephen.
table1.jpg
table2.jpg
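
A rough sketch of how both requests might be handled together: strip ".csv" from the filename value, and look up the samplename from the same map used for renaming. The helper names extra_columns and add_columns are hypothetical, the data row in the example is invented, and the first line of each file is again assumed to be the "mass,intensity" header.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: work out the two extra column values for one file.
sub extra_columns {
    my ($path, $map) = @_;
    my ($stem) = fileparse($path, qr/\.csv/i);   # name without directory or .csv
    my ($well) = $stem =~ /_([^_]+)\z/;          # last part of the name, e.g. A1
    my $sample = (defined($well) && defined($map->{$well})) ? $map->{$well} : '';
    return ($stem, $sample);
}

# Hypothetical helper: append both columns to the lines of one CSV.
# Assumes the first line is the header row.
sub add_columns {
    my ($path, $map, @lines) = @_;
    my ($stem, $sample) = extra_columns($path, $map);
    my @out;
    for my $i (0 .. $#lines) {
        (my $line = $lines[$i]) =~ s/\r?\n\z//;
        push @out, $i == 0 ? "$line,filename,samplename"
                           : "$line,$stem,$sample";
    }
    return @out;
}

my %map = (A1 => 'GO1_3');   # entry from the question, dots already replaced
# The data row below is made up purely to show the shape of the output.
print "$_\n" for add_columns('20130730_p12_A1.csv', \%map,
                             "mass,intensity\n", "100.5,2000\n");
```

This works from the original file name (before renaming), so the filename column keeps the date and p-number while samplename comes straight from the map. A well with no entry in the map gets an empty samplename rather than stopping the run.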