Read a large data file in chunks via Perl

I have a file with more than 700,000 records. I want to load 100,000 into a table, process them, then get the next 100,000 until I am done. How would I accomplish reading the file in chunks via Perl (currently I load all the records at once)?

Thanks
khanzada19 (Author) Asked:
 
Adam314 Commented:

while ( $line = <EXP_IMP_FILE> ) {
    #insert this record
    if (!($. % 5000) or eof(EXP_IMP_FILE)) {   # $. is the current input line number
        #process the batch
        #delete the processed rows
    }
}

 
Adam314 Commented:
Are the records fixed length?  If so, you could use the read function with
    (record size) * (number of records you want to read)

If they are terminated by something (such as newline), set $/ to the end-of-record character, then read 1 record at a time and save it in memory until you have all the records you want.
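For illustration, a minimal sketch of the fixed-length approach (the file name and 80-byte record size are hypothetical):

# Read fixed-length records, 100,000 at a time.
open(my $fh, "<", "your_file.txt") or die "could not open: $!\n";
my $record_size = 80;        # hypothetical record size in bytes
my $per_chunk   = 100_000;   # records per chunk
while (read($fh, my $buf, $record_size * $per_chunk)) {
    my @records = unpack("(a$record_size)*", $buf);   # split the buffer into records
    # process @records here (the last chunk may be smaller)
}
close($fh);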
 
khanzada19 (Author) Commented:
Records are separated by ';'. Could you give me a code example? Thanks
 
Adam314 Commented:

open(my $in, "<", "your_file.txt") or die "could not open: $!\n";
local $/ = ";";   #If you meant ; then newline, use ";\n"

my @records;
while (<$in>) {
    push @records, $_;
    if (@records == 100_000) {
        #process your records here
        @records = ();   #then clear for the next 100_000
    }
}
#any leftover records (fewer than 100_000) are still in @records here
close($in);

 
khanzada19 (Author) Commented:
I am doing the following, but it reads 1 record at a time instead of 5000. What am I doing wrong?

$ret = open(E_P_FILE, "< $PathFileName");
local $/ = ";\n";

my @records;
while ( $line = <EXP_IMP_FILE> ) {
    if ($#records < 5000){
        print "\n\n NumberOfLineImport := $NumberOfLineImport\n\n";
    }
    else{
        print "\n\n ELSE NumberOfLineImport := $NumberOfLineImport\n\n";
        @records = ();
    }
}#while
 
Adam314 Commented:

$ret = open(E_P_FILE, "< $PathFileName");
local $/ = ";\n";

my @records;
while ( $line = <EXP_IMP_FILE> ) {
    push @records, $line;
    if (@records == 5000) {
        #process records here, there will be 5000 of them in @records

        #then clear records
        @records = ();
    }
}#while

#If there are any records left over, @records will be non-empty here.
#You won't have 5000 though... if you want to process these, do so here

 
khanzada19 (Author) Commented:
I am sorry, but I don't follow what you mean. When I print $#records I am getting -1.
 
Adam314 Commented:
$#records is the highest index in the array.  If it is -1, you don't have any records.  Where are you printing it?
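As a quick illustration of what $#records reports (a minimal, hypothetical snippet):

my @records = ('a', 'b', 'c');
print $#records;        # 2  -- highest index
print scalar @records;  # 3  -- number of elements

my @empty;
print $#empty;          # -1 -- the array has no elements yet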
 
Adam314 Commented:
Oh... the file handle name you have is incorrect. Try this: replace line 1 above with this:

open(EXP_IMP_FILE, "< $PathFileName") or die "Could not open file: $!\n";

 
khanzada19 (Author) Commented:
The file handle name is correct; I forgot to change it when I did the cut and paste. Currently there are more than 500,000 records in the E_P_FILE. I want to first select 5000 records, insert them into a table, and do some processing; then, after processing, Perl would pick another 5000 records, insert them into the table, and process them .... and keep going until I am done processing all the records.
 
Adam314 Commented:
>>insert them into a table
Is this a database table, or some table you are keeping in RAM?  If it is a database table, there is no need to store the records in @records, you can simply do an insert.  Then after 5000 records, you can process them from the DB.
Note that if you insert your second set of 5000 records to the same table as the first, they will all be there together.
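A minimal sketch of that approach, assuming DBI, with a hypothetical DSN, credentials, staging table, and file path:

use DBI;

my $PathFileName = "your_file.txt";   # hypothetical path
# Hypothetical connection details and table name -- adjust for your database.
my $dbh = DBI->connect("dbi:Oracle:mydb", "user", "pass",
                       { RaiseError => 1, AutoCommit => 0 });
my $sth = $dbh->prepare("INSERT INTO staging (rec) VALUES (?)");

local $/ = ";\n";
open(EXP_IMP_FILE, "<", $PathFileName) or die "Could not open file: $!\n";
while (my $line = <EXP_IMP_FILE>) {
    chomp $line;                              # strip the ;\n terminator
    $sth->execute($line);                     # insert every record
    if (!($. % 5000) or eof(EXP_IMP_FILE)) {  # every 5000 lines, or at end of file
        $dbh->commit;
        # process the batch from the staging table, then delete the processed rows
    }
}
close(EXP_IMP_FILE);
$dbh->disconnect;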
 
khanzada19 (Author) Commented:
I delete them as soon as I process them, but I am still not getting 5000 records at once; I get 1 and it keeps incrementing. What I did now is the following. It seems to be working, but I was looking for a better way.

while ( $line = <EXP_IMP_FILE> ) {
    $count = $count + 1;
    if ($count == 5000) {
        #insert
        #process
        #delete
        $count = 0;
    }
}#while
 
ozo Commented:
> What I did now is the following. It seems to be working, but I was looking for a better way.
What would you consider to be better?
 
Adam314 Commented:
What you posted will only process every 5000th record; it will not insert all the records and process them in groups of 5000. Is this what you wanted?
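For comparison, a sketch of that loop with the insert moved outside the if, so every record is inserted and each batch of 5000 is processed (placeholders kept as comments, with eof() added so the final partial batch is handled):

while ( $line = <EXP_IMP_FILE> ) {
    #insert $line here -- every record gets inserted
    $count = $count + 1;
    if ($count == 5000 or eof(EXP_IMP_FILE)) {
        #process the batch
        #delete the processed rows
        $count = 0;
    }
}#while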
 
Sujith (Data Architect) Commented:
I guess khanzada19 is looking for bulk reading of the file in batches.
But even if you read them in batches, I don't think there is a bulk table loading feature available in Perl.
 
khanzada19 (Author) Commented:
Yes, that's what I want: to process in 5000-record chunks.