Read a large data file in chunks with Perl

I have a file with more than 700,000 records. I want to load 100,000 into a table, process them, then get the next 100,000, and so on until I am done. How would I read the file in chunks with Perl? (Currently I load all the records at once.)

Thanks
khanzada19 asked:

Adam314 commented:
Are the records fixed length?  If so, you could use the read function with
    (record size) * (number of records you want to read)
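
For the fixed-length case, a minimal sketch; the 80-byte record size, the batch size, and the file name are all hypothetical:

my $record_size = 80;        # hypothetical fixed record length
my $batch_size  = 100_000;   # records per chunk

open(my $in, "<", "your_file.dat") or die "could not open: $!\n";
binmode($in);
while (read($in, my $buf, $record_size * $batch_size)) {
    # $buf holds up to $batch_size records; split it into individual ones
    my @records = unpack("(a$record_size)*", $buf);
    # process @records here
}
close($in);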

If they are terminated by something (such as a newline), set $/ to the end-of-record character, then read one record at a time and save it in memory until you have all the records you want.
khanzada19 (Author) commented:
The records are separated by ";". Could you give me a code example? Thanks.
Adam314 commented:

open(my $in, "<", "your_file.txt") or die "could not open: $!\n";
local $/ = ";";    # if you meant ";" followed by a newline, use ";\n"

my @records;
while (<$in>) {
    push @records, $_;
    if (@records == 100_000) {
        # process your records here
        @records = ();    # then clear for the next 100_000
    }
}
# any leftover records (fewer than 100_000) are still in @records;
# process them here
close($in);
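
As a side note, here is a tiny demonstration of how $/ splits input into records, using an in-memory filehandle and made-up data:

my $data = "rec1;rec2;rec3;";
open(my $fh, "<", \$data) or die "could not open: $!\n";
local $/ = ";";
while (my $rec = <$fh>) {
    chomp $rec;       # chomp strips the trailing $/ (the ";")
    print "$rec\n";   # prints rec1, rec2, rec3 on separate lines
}
close($fh);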



khanzada19 (Author) commented:
I am doing the following, but it reads one record at a time instead of 5000. What am I doing wrong?

  $ret = open(E_P_FILE, "< $PathFileName");
  local $/=";\n";

my @records;
      while ( $line = <EXP_IMP_FILE> )

       if ($#records < 5000){
         print"\n\n NumberOfLineImport := $NumberOfLineImport\n\n";
       }
       else{
         print"\n\n ELSE NumberOfLineImport := $NumberOfLineImport\n\n";
             @records=();
       }
 }#while
Adam314 commented:

$ret = open(E_P_FILE, "< $PathFileName");
local $/ = ";\n";

my @records;
while ( $line = <EXP_IMP_FILE> ) {
    push @records, $line;
    if (@records == 5000) {
        # process records here; there will be 5000 of them in @records

        # then clear records for the next batch
        @records = ();
    }
}    # while

# If any records remain here, @records will be non-empty.
# You won't have 5000 of them, though... if you want to process the
# remainder, do so here.


khanzada19 (Author) commented:
I am sorry, but I don't follow what you mean. When I print $#records, I am getting -1.
Adam314 commented:
$#records is the highest index in the array.  If it is -1, you don't have any records.  Where are you printing it?
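
To illustrate the difference, with made-up data:

my @records = ("a;", "b;", "c;");
print scalar(@records), "\n";   # 3  -> number of elements
print $#records, "\n";          # 2  -> highest index
@records = ();
print $#records, "\n";          # -1 -> empty array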
Adam314 commented:
Oh... the filehandle name you have is incorrect. Try this: replace line 1 above with this one:

open(EXP_IMP_FILE, "< $PathFileName") or die "Could not open file: $!\n";


khanzada19 (Author) commented:
The filehandle name is correct; I forgot to change it when I cut and pasted. Currently there are more than 500,000 records in the E_P_FILE. I want to first select 5000 records, insert them into a table, and do some processing; then, after processing, Perl would pick the next 5000 records, insert them into the table, and process them, and so on until all the records are processed.
Adam314 commented:
>> insert them into a table
Is this a database table, or some table you are keeping in RAM? If it is a database table, there is no need to store the records in @records; you can simply do an insert. Then, after 5000 records, you can process them from the DB.
Note that if you insert your second set of 5000 records into the same table as the first, they will all be there together.
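
A rough sketch of the database-table approach using DBI; the DSN, credentials, and the staging table and column names are all hypothetical:

use DBI;

my $dbh = DBI->connect("dbi:Oracle:mydb", "user", "pass",
                       { RaiseError => 1, AutoCommit => 0 });
my $sth = $dbh->prepare("INSERT INTO staging (record) VALUES (?)");

open(EXP_IMP_FILE, "<", $PathFileName) or die "Could not open file: $!\n";
local $/ = ";\n";

my $count = 0;
while (my $line = <EXP_IMP_FILE>) {
    $sth->execute($line);    # insert each record as it is read
    if (++$count == 5000 or eof(EXP_IMP_FILE)) {
        $dbh->commit;        # make the batch visible
        # process the rows in the staging table here,
        # then clear it before the next batch
        $dbh->do("DELETE FROM staging");
        $dbh->commit;
        $count = 0;
    }
}
close(EXP_IMP_FILE);
$dbh->disconnect;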
khanzada19 (Author) commented:
I delete them as soon as I process them, but I am still not getting 5000 records at once; I get one and the count keeps incrementing. What I did instead is the following, and it seems to be working, but I was looking for a better way.

while ( $line = <EXP_IMP_FILE> ) {
    $count = $count + 1;
    if ($count eq 5000) {
        insert
        process
        delete
        $count = 0;
    }
}    # while
ozo commented:
> What I did instead is the following, and it seems to be working, but I was looking for a better way.
What would you consider to be better?
Adam314 commented:
What you posted will only process every 5000th record. It will not insert all of the records and process them in groups of 5000. Is that what you meant you wanted?
Sujith (Data Architect) commented:
I guess khanzada19 is looking for bulk reading of the file in batches. But even if you read them in batches, I don't think there is a bulk table-loading feature available in Perl.
khanzada19 (Author) commented:
Yes, that's what I want: to process the records in 5000-record chunks.
0
Adam314 commented:

while ( $line = <EXP_IMP_FILE> ) {
    # insert $line here
    if ( !($. % 5000) or eof(EXP_IMP_FILE) ) {
        # process the batch here
        # then delete it
    }
}
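
$. holds the current record number for the most recently read filehandle; with $/ set to ";\n" it counts records rather than lines. So !($. % 5000) is true on every 5000th record, and the eof() check makes sure the final batch of fewer than 5000 records is also processed.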


