johnny99

PERL and Text Files -- My Head Hurts

I'd really like someone to explain to me, in words that a dummy like me can understand, how Perl reads or handles text files.

Specifically, I don't understand what that

     while(<INFILE>);

thing means.

Say I was trying to write a script which would pick a random line from a 100-line text file.

I open the file with an

     or die "can't find file"

error handler, I understand it this far, but then in order to do something with the file I need

     while(<INFILE>){
     do my stuff
     }
     
Which I don't really understand -- does while(<INFILE>) mean "read every line of the file"? Does it mean "put every line of the file into $_ one by one"? Does it mean "keep on reading the file until some condition is satisfied, at which time whatever line I'm reading will be placed in $_"?

If I want to generate a random number, then pick that line from my 100 lines, how do I script that? And if I wanted to repeat the "get a random line" process a random number of times, how would I script that?

The trouble is that in my head the process goes like this:
   
     Generate a random number X
     Open the file
     Get line X of the file
     
Which ought to be simple, but PERL doesn't seem to do it like that. As far as I can work out from the book I have, it has to read each line of the file into memory and throw them away before it gets to the one it wants.

And if I want to repeat the process, getting a random line from my file Y number of times, I see it as:

     Generate a random number Y
     Repeat Y number of times
          Generate a random number X
          Get line X of the file
     Stop repeating
     
And again, maybe it's just me, but I'm finding that difficult. DO I have to open the file every time? Do I have to do that while(<INFILE>) thing every time or just once?

I'll gladly give stacks of points to someone who can make me understand how this all works, because it's making my head hurt...
martinag

>> while(<INFILE>){
>>   do my stuff
>> }

You're absolutely correct. As long as there is at least one more line left in the file, while(<INFILE>) reads the next line into $_ and the condition is true, so the body of the loop runs once for that line. When the end of the file is reached, <INFILE> returns undef, the condition is false, and the loop stops.
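A minimal sketch of what the loop does on each pass. It assumes a small in-memory "file" (the string $text and the names $in and @seen are made up for illustration) so that it runs without any file on disk. $. is Perl's built-in variable holding the number of the line just read.

```perl
use strict;
use warnings;

# Hypothetical three-line "file" held in a string, for illustration only.
my $text = "apple\nbanana\ncherry\n";
open(my $in, '<', \$text) or die "can't open in-memory file: $!";

my @seen;
while (<$in>) {                           # each pass reads ONE line into $_
    chomp;                                # strip the trailing newline from $_
    push @seen, sprintf("%d: %s", $., $_);  # $. is the current line number
}
close $in;                                # the loop ended at end of file

print "$_\n" for @seen;
```

So the loop is "read one line, run the body, repeat", and nothing more is held in memory than the current line.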

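srand; # seed the random number generator (recent Perls do this automatically on the first rand)
$lineNumber = int(rand 100) + 1; # get a number between 1 and 100
for ($i = 1; $i <= $lineNumber; $i++) {
  $thatLine = <INFILE>; # read the next line; a bare <INFILE>; would read it but not store it in $_
}

will do the trick for you. I am not using a while() here. Instead I use a for loop.
It starts by setting $i to 1. Then it runs as long as $i is lower than or equal to (<=) $lineNumber. Every time the loop has run, $i is incremented by one (++).

Each pass reads one line into $thatLine, so when the loop finishes, $thatLine holds line number $lineNumber. Only while(<INFILE>) stores the line in $_ automatically, which is why the assignment is written out here.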
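For comparison, here is a sketch of "generate X, get line X" as a small subroutine. The name nth_line and the in-memory 100-line stand-in file are assumptions for illustration, not part of the original answer. It reads and discards lines until $. (the current line number) matches the one wanted.

```perl
use strict;
use warnings;

# Return line number $want (counting from 1) from an open filehandle,
# or nothing (undef) if the file has fewer lines than that.
sub nth_line {
    my ($fh, $want) = @_;
    while (my $line = <$fh>) {
        return $line if $. == $want;    # $. is the current line number
    }
    return;
}

# Stand-in for a 100-line text file, so the example runs by itself.
my $text = join '', map { "line $_\n" } 1 .. 100;
open(my $in, '<', \$text) or die "can't open in-memory file: $!";

my $x    = int(rand 100) + 1;           # random line number, 1 to 100
my $line = nth_line($in, $x);
close $in;

print "line $x is: $line";
```

With a real file, only the open line changes, for example open(my $in, '<', 'file.txt').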

>> Generate a random number Y
>> Repeat Y number of times
>>   Generate a random number X
>>   Get line X of the file
>> Stop repeating

Correct.

>> DO I have to open the file every time? Do I have to do that while(<INFILE>) thing every time or just once?

No, not really. You can put the lines in an array and use it every time, which means you won't have to re-read the file:
srand;
$times = int(rand 10) + 1; # between 1 and 10
open (INFILE, "<file.txt") or die "open file: $!";
$lineCount = 0;
while (<INFILE>) {
  $lines[$lineCount++] = $_; # add the line to the array and increment $lineCount
}
close INFILE;
for ($i = 0; $i < $times; $i++) {
  $randomedLines[$i] = $lines[int(rand $lineCount)]; # random index from 0 to $lineCount - 1
}

The chosen lines are now in @randomedLines. Array indexes start at 0, so a 100-line file uses indexes 0 to 99.
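The same idea can be written more compactly. This is only a sketch under stated assumptions: the in-memory $text stands in for file.txt, and the names @picked and $times are illustrative. In list context, <$in> returns every remaining line at once, so no explicit loop is needed to fill the array.

```perl
use strict;
use warnings;

# Stand-in for file.txt, so the example runs without a file on disk.
my $text = join '', map { "line $_\n" } 1 .. 100;
open(my $in, '<', \$text) or die "can't open in-memory file: $!";
my @lines = <$in>;       # list context: read every line at once
close $in;               # the file is read once and never reopened

my $times = int(rand 10) + 1;              # Y: between 1 and 10
my @picked;
for (1 .. $times) {
    push @picked, $lines[ rand @lines ];   # X: random index 0 .. $#lines
}

print @picked;
```

Because each pick is independent, the same line can come up more than once.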

Martin
ozo
perldoc -q 'random line'
Found in perlfaq5.pod
  How do I select a random line from a file?

            Here's an algorithm from the Camel Book:

                srand;
                rand($.) < 1 && ($line = $_) while <>;

            This has a significant advantage in space over reading
            the whole file in. A simple proof by induction is
            available upon request if you doubt its correctness.
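The one-liner is dense, so here is a spelled-out sketch of the same idea. The subroutine name random_line and the in-memory stand-in file are assumptions for illustration. Line number N replaces the current pick with probability 1/N (rand(N) < 1), so the first line is always taken, the second replaces it half the time, and so on. After the last line, every line has had the same chance of being the survivor.

```perl
use strict;
use warnings;

# Pick one line uniformly at random in a single pass, keeping only
# one candidate line in memory at any time.
sub random_line {
    my ($fh) = @_;
    my $pick;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
        # rand($count) is in [0, $count), so this is true with
        # probability 1/$count: always for line 1, half the time for line 2.
        $pick = $line if rand($count) < 1;
    }
    return $pick;        # undef if the file was empty
}

my $text = join '', map { "line $_\n" } 1 .. 100;   # stand-in file
open(my $in, '<', \$text) or die "can't open in-memory file: $!";
my $line = random_line($in);
close $in;

print "picked: $line";
```

This answers the "read each line and throw it away" worry: the file is read once, and the number of lines does not need to be known in advance.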

@lines = <INFILE>; #grab all lines from the file

# fisher_yates_shuffle( \@array ) :
# generate a random permutation of @array in place
sub fisher_yates_shuffle {
        my $array = shift;
        my $i;
        for( $i = @$array; --$i; ){
            my $j = int rand ($i+1);
            next if $i == $j;
            @$array[$i,$j] = @$array[$j,$i];
        }
}

fisher_yates_shuffle( \@lines );  # permutes @lines
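Shuffling is useful when several different lines are wanted with no repeats. The sketch below swaps in shuffle from the core List::Util module in place of the hand-written routine above; $want and the stand-in @lines are hypothetical values for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Stand-in for @lines = <INFILE>, so the example runs by itself.
my @lines = map { "line $_\n" } 1 .. 100;

my $want     = 5;                          # hypothetical: how many lines to pick
my @shuffled = shuffle @lines;             # random order, each line exactly once
my @picked   = @shuffled[0 .. $want - 1];  # first $want lines of the shuffle

print @picked;
```

Unlike picking a random index each time, this can never return the same line twice.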
johnny99

ASKER

Martinag, you can certainly have the points, for being the only person to even try and answer the question in the way it was asked, i.e. slowly and carefully.

Other people who replied, I'm sure your stuff is very clever, but the information about what all of that stuff means is more valuable than any amount of neat solutions, whether they work or not...

Can you tell me a good place on the web or a good book to help me with questions like this?
perldoc -q book
Found in perlfaq2.pod
  Perl Books

            A number of books on Perl and/or CGI programming are
            available. A few of these are good, some are ok, but
            many aren't worth your money. Tom Christiansen maintains
            a list of these books, some with extensive reviews, at
            http://www.perl.com/perl/critiques/index.html.

            The incontestably definitive reference book on Perl,
            written by the creator of Perl, is now in its second
            edition:

                Programming Perl (the "Camel Book"):
                    Authors: Larry Wall, Tom Christiansen, and Randal Schwartz
                    ISBN 1-56592-149-6      (English)
                    ISBN 4-89052-384-7      (Japanese)
                    URL: http://www.oreilly.com/catalog/pperl2/
                (French, German, Italian, and Hungarian translations also
                available)

            The companion volume to the Camel containing thousands
            of real-world examples, mini-tutorials, and complete
            programs (first premiering at the 1998 Perl Conference),
            is:

                The Perl Cookbook (the "Ram Book"):
                    Authors: Tom Christiansen and Nathan Torkington,
                                with Foreword by Larry Wall
                    ISBN: 1-56592-243-3
                    URL:  http://perl.oreilly.com/cookbook/

            If you're already a hard-core systems programmer, then
            the Camel Book might suffice for you to learn Perl from.
            But if you're not, check out:

                Learning Perl (the "Llama Book"):
                    Authors: Randal Schwartz and Tom Christiansen
                                with Foreword by Larry Wall
                    ISBN: 1-56592-284-0
                    URL:  http://www.oreilly.com/catalog/lperl2/

            Despite the picture at the URL above, the second edition
            of "Llama Book" really has a blue cover, and is updated
            for the 5.004 release of Perl. Various foreign language
            editions are available, including *Learning Perl on
            Win32 Systems* (the Gecko Book).

            If you're not an accidental programmer, but a more
            serious and possibly even degreed computer scientist who
            doesn't need as much hand-holding as we try to provide
            in the Llama or its defurred cousin the Gecko, please
            check out the delightful book, *Perl: The Programmer's
            Companion*, written by Nigel Chapman.

            You can order O'Reilly books directly from O'Reilly &
            Associates, 1-800-998-9938. Local/overseas is 1-707-829-
            0515. If you can locate an O'Reilly order form, you can
            also fax to 1-707-829-0104. See http://www.ora.com/ on
            the Web.

            What follows is a list of the books that the FAQ authors
            found personally useful. Your mileage may (but, we hope,
            probably won't) vary.

            Recommended books on (or muchly on) Perl follow; those
            marked with a star may be ordered from O'Reilly.

    References
                    *Programming Perl
                        by Larry Wall, Tom Christiansen, and Randal L. Schwartz

                    *Perl 5 Desktop Reference
                        By Johan Vromans

    Tutorials
                    *Learning Perl [2nd edition]
                        by Randal L. Schwartz and Tom Christiansen
                            with foreword by Larry Wall

                    *Learning Perl on Win32 Systems
                        by Randal L. Schwartz, Erik Olson, and Tom Christiansen,
                            with foreword by Larry Wall

                    Perl: The Programmer's Companion
                        by Nigel Chapman

                    Cross-Platform Perl
                        by Eric F. Johnson

                    MacPerl: Power and Ease
                        by Vicki Brown and Chris Nandor, foreword by Matthias Neeracher

    Task-Oriented
                    *The Perl Cookbook
                        by Tom Christiansen and Nathan Torkington
                            with foreword by Larry Wall

                    Perl5 Interactive Course [2nd edition]
                        by Jon Orwant

                    *Advanced Perl Programming
                        by Sriram Srinivasan

                    Effective Perl Programming
                        by Joseph Hall

    Special Topics
                    *Mastering Regular Expressions
                        by Jeffrey Friedl

                    How to Set up and Maintain a World Wide Web Site [2nd edition]
                        by Lincoln Stein


Thanks Ozo -- it sounds as if I'm a lama, rather than a camel, at this stage.

johnny "that's pronounced 'lay-muh', right?" 99
perldoc
will also give you access to much of the information in the camel,
as well as answers to many common questions.

  while( <INFILE> ){ }
is actually a magic shorthand for
  while( defined($_=<INFILE>) ){ }
this is covered in
perldoc perlop
under I/O Operators
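A small sketch of why the defined() matters. It assumes a made-up file whose last line is the single character 0 with no trailing newline. The string "0" is false in Perl, so a plain truth test would stop one line early, while the defined() test only stops at the real end of file.

```perl
use strict;
use warnings;

# Hypothetical file whose last line is "0" with no trailing newline.
my $text = "first\n0";

# Plain truth test: stops early, because the string "0" is false.
open(my $in1, '<', \$text) or die "can't open in-memory file: $!";
my @by_truth;
while (1) {
    my $line = <$in1>;
    last unless $line;
    push @by_truth, $line;
}
close $in1;

# defined() test, which is what while (<INFILE>) does implicitly.
open(my $in2, '<', \$text) or die "can't open in-memory file: $!";
my @by_defined;
while (defined(my $line = <$in2>)) {
    push @by_defined, $line;
}
close $in2;

printf "truth test read %d line(s), defined test read %d\n",
    scalar @by_truth, scalar @by_defined;
# prints: truth test read 1 line(s), defined test read 2
```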