Perl speed up looping


I have a large loop that is taking a long time to execute, and I need to increase its speed.

Main loop {
  open file A
  split and break up the variables
  sub program {
    open file B
    split and break up the variables
    if variable from Main loop = variable from sub, do another loop {
      open file C
      if variable from current loop = variable from file B, do another loop {
        open file D (same file as A)
        if variable from current loop = variable from file C
          get data and store
        close file D
      }
      close file C
    }
    close file B
  }
  sub program {
    similar looping as above, but file B and file C are switched
  }
  close file A
  print data
}

The loops work, but the files can be large.

I changed the foreach loops to while loops, but that didn't change the speed much.

I'm not sure whether using grep to search the files would be faster than looping over them with while loops.



We need to see the actual code.

Have you profiled the script to see where it's spending most of its time?

Devel::NYTProf - Powerful fast feature-rich Perl source code profiler
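A typical NYTProf run (assuming the module is installed from CPAN, and `yourscript.pl` stands in for your script's actual name) looks like this:

```shell
# Profile the script; writes ./nytprof.out
perl -d:NYTProf yourscript.pl

# Generate an HTML report (in ./nytprof/) and open it in a browser
nytprofhtml --open
```

The report breaks down wall-clock time per statement and per sub, so it will show exactly which of the nested loops and file opens dominate.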

Are you actually defining the subs inside a loop?  If so, that's a mistake.
If you're opening/reopening files inside a loop, as your pseudocode implies, that's one reason it would run slowly.
mikeysmailbox1Author Commented:
Hi FishMonger

Here is the sub  I and doing and yes I am opening and closing the file.
Should I use something else?
I originally used arrays but it was not any faster.

sub outconditions {
  my @outparms = @_;

  while ($out = <SQLDATAEM_OUTCOND>) {
    $out =~ s/^\s+//;
    $out =~ s/\s+$//;
    ($P_OUTTABID, $P_OUTJOBID, $P_OUTCON, $P_ODATO, $P_SIGN) = split /\|/, $out;

    if (!$P_OUTTABID) {
      # blank line -- nothing to do
    }
    elsif ($outparms[0] eq $P_OUTTABID && $outparms[1] eq $P_OUTJOBID) {
      # skip header
      if ($P_OUTCON =~ /condition/ and $P_ODATO =~ /odate/ and $P_SIGN =~ /and_or/) {
        push @OUTLIST, "$P_OUTCON   $P_ODATO   $P_SIGN";

        open SQLDATAEM_INCOND, "/tmp/estee.SQLDATAEM_INCOND.txt";
        while ($GET_SUCC_INC_LIST = <SQLDATAEM_INCOND>) {
          $GET_SUCC_INC_LIST =~ s/^\s+//;
          $GET_SUCC_INC_LIST =~ s/\s+$//;
          # ... split $GET_SUCC_INC_LIST into the GET_SUCC_* fields ...

          if ("${P_OUTCON}" eq "${GET_SUCC_P_INCON}") {
            open SQLDATAEM_JOBDATA_OUT, "/tmp/estee.SQLDATAEM_JOBDATA.txt";
            while ($GET_SUCC_GET_JOB = <SQLDATAEM_JOBDATA_OUT>) {
              if ($GET_SUCC_P_INTABID eq $GET_SUCC_TABLE_ID and $GET_SUCC_P_INJOBID eq $GET_SUCC_JOB_ID and $P_SIGN ne "-") {
                # ... get data and store ...
              }
              else {
                $GET_SUCC_INC_LIST     = ();
                $GET_SUCC_TABLE_ID     = ();
                $GET_SUCC_JOB_ID       = ();
                $GET_SUCC_SCHED_TABLE  = ();
                $GET_SUCC_PARENT_TABLE = ();
                $GET_SUCC_APPLICATION  = ();
                $GET_SUCC_GROUP_NAME   = ();
                $GET_SUCC_MEMNAME      = ();
                $GET_SUCC_JOB_NAME     = ();
                $GET_SUCC_P_INTABID    = ();
                $GET_SUCC_P_INJOBID    = ();
                $GET_SUCC_P_INCON      = ();
                $GET_SUCC_P_ODATI      = ();
                $GET_SUCC_P_SIGNI      = ();
                $GET_SUCC_GET_JOB      = ();
                $P_OUTTABID            = ();
                $P_OUTJOBID            = ();
                $P_OUTCON              = ();
                $P_ODATO               = ();
                $P_SIGN                = ();
              }
            }
            close SQLDATAEM_JOBDATA_OUT;
          }
        }
        close SQLDATAEM_INCOND;
      }
    }
  }
}
I see lots of problems with that sub.  Some definitely are causing your script to be very inefficient and others are poor coding practices that make the script harder to read and maintain.

In the vast majority of cases, the slowest part of a program is its disk I/O, and your script is reopening and re-parsing the same 3 files over and over and over again. That's very wasteful/inefficient, and I'm sure it's the main reason your script is slow.

You haven't given any info about the contents of those files, but you do say they can be large, so let's assume the first file has 1,000 lines (probably a very low estimate) that meet the criteria to reach the point where you open the second file. That second file will be opened and re-parsed 1,000 times. If that file has 1,000 lines that meet the criteria to reach the point where you open the third file, then the third file is reopened and re-parsed 1,000,000 times. Since this sub is called from within an outer loop, you need to multiply those numbers by that factor.

 How many billions of times do you want to open and parse that 3rd file?
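To illustrate the alternative: read each file once, up front, into a hash keyed on the fields you compare, then do the inner "loops" as hash lookups. This is a minimal sketch based on the pipe-delimited format in your sub; the key fields and file names are my assumptions, so adjust them to your actual layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a pipe-delimited file ONCE and index its rows by a key built
# from the first two fields (assumed here to be table id and job id).
sub load_index {
    my ($file) = @_;
    my %index;
    open my $fh, '<', $file or die "Can't open $file: $!";
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;
        my @fields = split /\|/, $line;
        next unless $fields[0];    # skip blank lines
        push @{ $index{"$fields[0]|$fields[1]"} }, \@fields;
    }
    close $fh;
    return \%index;
}

# Load each file exactly once, before the main loop:
#   my $incond  = load_index('/tmp/estee.SQLDATAEM_INCOND.txt');
#   my $jobdata = load_index('/tmp/estee.SQLDATAEM_JOBDATA.txt');
#
# Then, inside the loop, instead of reopening and rescanning a file:
#   for my $row (@{ $incond->{"$P_OUTTABID|$P_OUTJOBID"} || [] }) {
#       my ($intabid, $injobid, $incon, $odati, $signi) = @$row;
#       ...
#   }
```

Each lookup is then O(1) instead of a full pass over the file, and the files are parsed once each rather than millions of times.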

Profiling your script with the Devel::NYTProf module I mentioned will give the real stat numbers.
Since I don't have any clue about the contents of those files, I can't say whether what I'm about to suggest will help much, but it might.

You could easily and efficiently load those files into a database (temporary or not) and use SQL statements to filter/extract different combinations of data based on your criteria. Doing that should be far more efficient than the nested looping and parsing.
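A sketch of that approach with DBI and DBD::SQLite. The table and column names here are hypothetical (based on the split in your sub); you'd load each of the three files into its own table the same way and replace the nested loops with joins.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database; give dbname a filename instead to persist it.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 0 });

$dbh->do('CREATE TABLE outcond (tabid TEXT, jobid TEXT, cond TEXT, odate TEXT, sign TEXT)');

# Bulk-load one pipe-delimited file -- done once, not per outer-loop pass.
my $sth = $dbh->prepare('INSERT INTO outcond VALUES (?,?,?,?,?)');
open my $fh, '<', '/tmp/estee.SQLDATAEM_OUTCOND.txt' or die "open: $!";
while (my $line = <$fh>) {
    $line =~ s/^\s+|\s+$//g;
    my @f = split /\|/, $line;
    next unless $f[0];    # skip blank lines
    $sth->execute(@f[0 .. 4]);
}
close $fh;
$dbh->commit;

# An index makes the lookup cheap; the nested-loop match becomes one query.
$dbh->do('CREATE INDEX idx_outcond ON outcond (tabid, jobid)');
my $rows = $dbh->selectall_arrayref(
    'SELECT cond, odate, sign FROM outcond WHERE tabid = ? AND jobid = ?',
    undef, $sometabid, $somejobid);
```

Wrapping the inserts in one transaction (AutoCommit off, then commit) matters for load speed; committing per row would be slow for large files.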


Question has a verified solution.
