Solved

Opening a large file in Perl -> "Out of Memory!" error

Posted on 2003-03-20
Last Modified: 2008-01-09
I am attempting to open a 24MB file using Perl. Below is the code that I am attempting to use:

my $file = 'bigfile.temp';
open(FILE,"<$file") || die "Cannot open $file: $!";
my $sgml = join('',<FILE>); # The program dies on this line
close(FILE);
print "Read ".length($sgml)." characters from $file\n";

The program runs for a while, starts accessing the disk (paging from memory), and then halts. The error message printed is simply:

Out of memory!

I don't see why Perl would crash on such a problem. I identified the line that causes the crash by using output statements, which have been removed. I have also attempted a different solution using:

foreach my $line (<FILE>) ...

However, this alternative has the same problem: the body of the loop is not executed even once. Can anyone shed light on why Perl would crash like this?

Other Information:
------------------
I am running Perl version 5.6.0 on Windows 2000 with 512MB of physical memory.
The content of the file is SGML.
The file was written using Perl (external to the previous script).
The file can be viewed through a text editor (Textpad), so I assume that the file is not corrupted.
Windows Task Manager indicates that the memory usage of Perl peaks at 80MB, then drops off to 15MB, then rises back up to 80MB, at which point Perl halts with the "Out of Memory!" error.
Question by:swackerl
4 Comments
 
LVL 5

Accepted Solution

by:
Sapa earned 750 total points
ID: 8175151
This line reads all 24 megabytes into memory (in fact twice: once into an anonymous list of lines, and again when join concatenates that list into the scalar variable $sgml):

my $sgml = join('',<FILE>); # The program dies on this line

This line:

foreach my $line (<FILE>) ...

also reads the list of all lines into memory before the loop starts.
If you need sequential reading, use:

while (defined(my $line = <FILE>)) { ...

instead.
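
A minimal sketch of the line-by-line approach, assuming the goal is only to count characters as in the original script. The count_chars name and the lexical filehandle are illustrative and not from the question. Only one line is held in memory at a time:

```perl
use strict;
use warnings;

# Hypothetical helper: sums line lengths without holding the whole file.
sub count_chars {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $total = 0;
    # In scalar context <$fh> returns one line per iteration,
    # whereas foreach (<$fh>) builds the full list of lines first.
    while (defined(my $line = <$fh>)) {
        $total += length($line);
    }
    close($fh);
    return $total;
}

# Usage: perl count.pl bigfile.temp
if (@ARGV) {
    my $file = $ARGV[0];
    print "Read ", count_chars($file), " characters from $file\n";
}
```

If the whole text really is needed in one scalar, this loop does not help by itself, but it should show whether the file can be read at all.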

--
Andrey
 
LVL 5

Expert Comment

by:Sapa
ID: 8175233
P.S. A list of lines with a total length of 24 MB occupies even more than 24 MB in memory. Each line (element of the list) needs some additional space for internal data.

You can try to load the file into a scalar variable directly:

# try to read up to 1 billion bytes into a scalar variable
# $nread contains the number of bytes actually read
my $nread = read(FILE, my $sgml, 1000000000);
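
A sketch of the same single-read idea, assuming it is better to request only the file's actual size (taken from -s) than a fixed one billion bytes. The slurp_file name is hypothetical. This avoids the intermediate list of lines that join('', <FILE>) builds:

```perl
use strict;
use warnings;

# Hypothetical helper: reads a whole file into one scalar with a single read().
sub slurp_file {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    binmode($fh);                 # raw bytes, so the count matches the size on disk
    my $size = (-s $fh) || 0;     # -s yields a false value for an empty file
    my $content = '';
    my $nread = read($fh, $content, $size);
    defined $nread or die "Read error on $file: $!";
    close($fh);
    return $content;
}

# Usage: perl slurp.pl bigfile.temp
if (@ARGV) {
    my $sgml = slurp_file($ARGV[0]);
    print "Read ", length($sgml), " bytes from $ARGV[0]\n";
}
```

Because of binmode, the length reported here is in bytes and includes any carriage returns in the file.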

--
Andrey
 

Author Comment

by:swackerl
ID: 8175323
Any idea why Perl would crash with such low memory usage? I realize that the data gets copied several times, but even if there are 4 copies of the file and each copy takes up double the space that it does on disk, that's still under 200MB. With 512MB of physical memory (and even more if virtual memory is used), I don't see the reason why Perl would complain that it was "Out of Memory!".

Thanks for the quick answer.
 
LVL 5

Expert Comment

by:Sapa
ID: 8175381
Maybe the user launching this script has a small memory limit? As 'Administrator' I can create such big data structures without a problem. I am using ActivePerl build 626 (5.6.1) on Windows 2000 with only 256MB.

--
Andrey
