Solved

Efficient Perl code - 3 dimensional array???

Posted on 2006-04-25
Last Modified: 2012-08-14
I have written some real perl noob code, which performs a file open and close for every line of a CSV file.  You can imagine how long this takes!

An extract from the (input) CSV file is as follows:

"1","SY","CASTELFRANCO MODENA","Italy","sdfvsdvb"
"2","SY","OUDENBURG","Belgium","zfbnmsrtn"
"3","SY","richmond","United Kingdom","ahnfgbxzvb"
"4","SY","saint martin","France","asdgddfh"
"5","SY","Sheffield","United Kingdom","dfafghdhrtnad"
"6","SY","Sigmaringendorf","Germany","advbrth"
"7","SY","Torino TO","Italy","adfhnfgjdt"

What I end up with is around 50 new CSV files (one for each country), each filled with the lines that are relevant to that country, i.e. #3 and #5 from the extract above would be put in a file called United Kingdom.csv.  So it all works perfectly....except it takes about 1 year to complete!!!

...Here's my noob code:

{
    local @ARGV = <C:/reports/incidents/reports/test/*.csv>; # make <> read every .csv file in the directory
    while (<>) { # while we have data
        chomp;
        my @line = split /"/; # split on the quote character; the country ends up in $line[7]
        # append to (or create) a file with the same name as the country from the imported csv line
        open OUTPUTFILE, ">> C:/reports/incidents/reports/test/$line[7].csv";
        print OUTPUTFILE join('"', @line, "\n"); # print the line
        close OUTPUTFILE; # close the file
    }
}
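Why index 7? Splitting on the quote character produces an empty leading piece and a "," piece between every pair of fields, so the fourth field is the eighth piece. A minimal sketch, assuming every field is double-quoted and contains no embedded quotes; the helper name country_of is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the country field of one CSV line.
# Assumes every field is double-quoted and has no embedded quotes.
sub country_of {
    my ($line) = @_;
    my @parts = split /"/, $line;
    # For '"3","SY","richmond","United Kingdom","..."' the pieces are:
    #   0:''  1:'3'  2:','  3:'SY'  4:','  5:'richmond'  6:','  7:'United Kingdom' ...
    return $parts[7];
}

print country_of('"3","SY","richmond","United Kingdom","ahnfgbxzvb"'), "\n";
# prints: United Kingdom
```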


So my question is, who can make this more efficient???....I thought about using 3 dimensional hashed arrays to store the data for each country in memory and then performing a single write operation, but this pushes my Perl knowledge too far!

300 points to first person to post the working modified code.

Thanks,

R
Question by:trickys77
LVL 6

Expert Comment

by:tone28
ID: 16534292


{
    local @ARGV = <C:/reports/incidents/reports/test/*.csv>; # make <> read every .csv file in the directory
    my %hash;
    while (<>) { # while we have data
        chomp;
        my @line = split /"/; # split on the quote character
        push @{ $hash{$line[7]} }, $_;
    }
    foreach my $name (keys %hash) {
        open(my $fh, '>>', "C:/reports/incidents/reports/test/$name.csv") or die "Cannot open $name.csv: $!";
        print $fh "$_\n" for @{ $hash{$name} };
        close($fh); # close the file
    }
}
LVL 6

Expert Comment

by:tone28
ID: 16534313
The one thing that I don't like about the above script is that you are reading CSV files from the same directory you're putting your output CSVs in. You should change the directories, but I tried to keep it as similar to yours as possible.
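To illustrate the directory point: if input and output share a directory, a second run would pick up the country files from the first run as input. A rough sketch of the same grouping idea with separate directories follows. The sub name split_by_country and both directory arguments are assumptions for illustration, and it opens output files with '>' instead of '>>' so a rerun replaces files instead of appending to them.

```perl
use strict;
use warnings;

# Hypothetical helper: read every .csv in $in_dir and write one file per
# country into $out_dir. Both directory arguments are assumptions.
sub split_by_country {
    my ($in_dir, $out_dir) = @_;
    my %lines_for;    # country => array ref of raw lines

    opendir my $dh, $in_dir or die "Cannot open directory $in_dir: $!";
    my @files = sort grep { /\.csv\z/i } readdir $dh;
    closedir $dh;

    for my $file (@files) {
        open my $in, '<', "$in_dir/$file" or die "Cannot read $file: $!";
        while (my $line = <$in>) {
            chomp $line;
            my $country = (split /"/, $line)[7];
            next unless defined $country;    # skip blank or short lines
            push @{ $lines_for{$country} }, $line;
        }
        close $in;
    }

    for my $country (sort keys %lines_for) {
        my $path = "$out_dir/$country.csv";
        # '>' truncates, so a second run replaces the file instead of doubling it
        open my $out, '>', $path or die "Cannot write $path: $!";
        print {$out} "$_\n" for @{ $lines_for{$country} };
        close $out or die "Cannot close $path: $!";
    }
    return scalar keys %lines_for;    # number of country files written
}
```

It could be called as split_by_country('C:/some/input', 'C:/some/output'), where both paths are placeholders.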


Author Comment

by:trickys77
ID: 16534440
haha yeah I had already realised that too and changed it in my script....Thanks for the heads up!

The script works perfectly and runs in a nippy 4 seconds!!!

Excellent coding....

Any chance I can squeeze out some comments so that I can understand it before I award the points?

Don't worry if you don't have time tho, I will award anyway.

Thanks, and great job. :o)
LVL 6

Accepted Solution

by:
tone28 earned 1200 total points
ID: 16534569
{
    local @ARGV = <C:/reports/incidents/reports/test/*.csv>; # make <> read every .csv file in the directory
    my %hash; # start with an empty hash - good practice
    while (<>) { # while we have data
        chomp;
        my @line = split /"/; # split on the quote character to get the country name in $line[7]

        push @{ $hash{$line[7]} }, $_; # push the current line ($_) onto an anonymous array stored in the hash under the country name
    }
    foreach my $name (keys %hash) {
        open(my $fh, '>>', "C:/reports/incidents/reports/test/$name.csv") or die "Cannot open $name.csv: $!"; # append to (or create) the country's file
        print $fh "$_\n" for @{ $hash{$name} }; # This is just a one line way of doing a foreach my $element (@array)
        close($fh); # close the file
    }
}

# Hope that helps. Let me know if you need any further detail.
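If the input ever grows too large to hold in memory, a different technique from the hash-of-arrays above is to keep one output filehandle open per country and reuse it, so each file is still opened only once. This is only a sketch under assumptions: the sub name stream_by_country and the $out_dir argument are made up for illustration, and it relies on the number of countries (around 50 in the question) staying below the operating system's limit on open files.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each line read from $in_fh to
# "$out_dir/<country>.csv", opening every output file only once.
sub stream_by_country {
    my ($in_fh, $out_dir) = @_;
    my %fh_for;    # country => already-open output filehandle

    while (my $line = <$in_fh>) {
        chomp $line;
        my $country = (split /"/, $line)[7];
        next unless defined $country;    # skip blank or short lines

        my $out = $fh_for{$country};
        unless ($out) {
            # First line seen for this country: open its file and cache the handle
            my $path = "$out_dir/$country.csv";
            open $out, '>>', $path or die "Cannot append to $path: $!";
            $fh_for{$country} = $out;
        }
        print {$out} "$line\n";
    }

    for my $country (keys %fh_for) {
        close $fh_for{$country} or die "Cannot close $country.csv: $!";
    }
    return scalar keys %fh_for;    # number of country files touched
}
```

Lines are written as they are read, so memory use stays flat regardless of input size; the trade-off is one open handle per country for the duration of the run.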



Author Comment

by:trickys77
ID: 16534812
spot on....cheers :oD