Solved

PERL and Text Files -- My Head Hurts

Posted on 1998-10-27
Last Modified: 2010-03-04
I'd really like someone to explain to me, in words that a dummy like me can understand, how Perl reads or handles text files.

Specifically, I don't understand what that

     while(<INFILE>);

thing means.

Say I was trying to write a script which would pick a random line from a 100-line text file.

I open the file with an

     or die "can't find file"

error handler, I understand it this far, but then in order to do something with the file I need

     while(<INFILE>){
     do my stuff
     }
     
Which I don't really understand -- does while(<INFILE>) mean "read every line of the file"? Does it mean "put every line of the file into $_ one by one"? Does it mean "keep on reading the file until some condition is satisfied, at which time whatever line I'm reading will be placed in $_"?

If I want to generate a random number, then pick that line from my 100 lines, how do I script that? And if I wanted to repeat the "get a random line" process a random number of times, how would I script that?

The trouble is that in my head the process goes like this:
   
     Generate a random number X
     Open the file
     Get line X of the file
     
Which ought to be simple, but PERL doesn't seem to do it like that. As far as I can work out from the book I have, it has to read each line of the file into memory and throw them away before it gets to the one it wants.

And if I want to repeat the process, getting a random line from my file Y number of times, I see it as:

     Generate a random number Y
     Repeat Y number of times
          Generate a random number X
          Get line X of the file
     Stop repeating
     
And again, maybe it's just me, but I'm finding that difficult. Do I have to open the file every time? Do I have to do that while(<INFILE>) thing every time or just once?

I'll gladly give stacks of points to someone who can make me understand how this all works, because it's making my head hurt...
Question by:johnny99
8 Comments
 
LVL 4

Expert Comment

by:martinag
>> while(<INFILE>){
>>   do my stuff
>> }

You've got it, and it's your second reading that is right. Each time the while() condition is evaluated, Perl tries to read the next line from the file. If there is one, it is put in $_ and the condition is true, which tells the while() statement to run its body once more. When the end of the file is reached there is nothing left to read, the condition is false, and the loop stops.
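
To see that loop in action, here is a minimal sketch. The three sample lines and the names $text and @seen are made up for illustration, and the file is faked with an in-memory string so the script runs on its own; with a real file you would open it by name instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real text file.
my $text = "first line\nsecond line\nthird line\n";

# Opening a reference to a string gives a filehandle that reads from memory.
open(INFILE, '<', \$text) or die "can't open in-memory file: $!";

my @seen;
while (<INFILE>) {
    # On each pass the next line (newline included) is in $_
    # and its line number, counting from 1, is in $. .
    chomp;                          # strip the trailing newline from $_
    push @seen, join(': ', $., $_);
}
close INFILE;

# The loop ended because <INFILE> ran out of lines.
print "$_\n" for @seen;
```

The body runs exactly once per line, so nothing needs to count the lines in advance.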

srand; # Seed the random number generator
$lineNumber = int(rand 100) + 1; # Get a number between 1 and 100
for ($i = 1; $i <= $lineNumber; $i++) {
  $thatLine = <INFILE>; # Read the next line into $thatLine
}

will do the trick for you. I am not using a while() here. Instead I use a for loop.
It starts by setting $i to 1. Then it runs as long as $i is lower than or equal to (<=) $lineNumber. Every time the loop has run, $i is incremented by one (++).

Note that only while(<INFILE>) puts the line in $_ automatically. Anywhere else you have to assign it yourself, which is why each line is read straight into $thatLine. When the loop has finished, $thatLine holds the last line that was read, which is line number $lineNumber.

>> Generate a random number Y
>> Repeat Y number of times
>>   Generate a random number X
>>   Get line X of the file
>> Stop repeating

Correct.

>> DO I have to open the file every time? Do I have to do that while(<INFILE>) thing every time or just once?

No, not really. You can put the lines in an array and use it every time, which means you won't have to re-read the file:
srand;
$times = int(rand 10) + 1; # Between 1 and 10
open (INFILE, "<file.txt") or die "open file: $!";
$lineCount = 0;
while (<INFILE>) {
  $lines[$lineCount++] = $_; # Add line to array and increment $lineCount
}
close INFILE;
for ($i = 0; $i < $times; $i++) {
  $randomedLines[$i] = $lines[int(rand $lineCount)]; # Array indexes run from 0 to $lineCount - 1
}

The randomly picked lines will now be in @randomedLines.
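
For comparison, here is a more compact sketch of the same read-once, pick-many idea. It is only an illustration: the 100 generated sample lines and the names $text and @picked are assumptions, and an in-memory string stands in for the file so the script is self-contained.

```perl
use strict;
use warnings;

# Hypothetical sample data; a real script would open a file on disk instead.
my $text = '';
$text .= "line $_\n" for 1 .. 100;
open(INFILE, '<', \$text) or die "can't open in-memory file: $!";

# In list context <INFILE> returns every remaining line at once.
my @lines = <INFILE>;
close INFILE;
chomp @lines;

my $times = int(rand 10) + 1;       # between 1 and 10 picks
my @picked;
for (1 .. $times) {
    # rand @lines gives a number from 0 up to, but not including, the
    # line count; int() turns it into a valid array index.
    push @picked, $lines[ int(rand @lines) ];
}

print "$_\n" for @picked;
```

Because the index is derived from the size of @lines, this keeps working if the file turns out to have more or fewer than 100 lines. The same line can be picked more than once.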

Martin
 
LVL 84

Expert Comment

by:ozo
perldoc -q 'random line'
Found in perlfaq5.pod
  How do I select a random line from a file?

            Here's an algorithm from the Camel Book:

                srand;
                rand($.) < 1 && ($line = $_) while <>;

            This has a significant advantage in space over reading
            the whole file in. A simple proof by induction is
            available upon request if you doubt its correctness.
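
Unpacked into a longer form, the one-liner might look like the sketch below. The sample data and the use of INFILE with an in-memory string are assumptions for illustration; the FAQ version reads from <> instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real file.
my $text = '';
$text .= "line $_\n" for 1 .. 100;
open(INFILE, '<', \$text) or die "can't open in-memory file: $!";

my $line;
while (<INFILE>) {
    # $. is the number of the line just read. rand($.) is below 1 with
    # probability 1/$., so line 1 is always kept, line 2 replaces it half
    # the time, line 3 a third of the time, and so on.
    if (rand($.) < 1) {
        $line = $_;
    }
}
close INFILE;

# Only one candidate line is held in memory at any time.
print "picked: $line";
```

The reasoning behind it: if each of the first k-1 lines is the current pick with probability 1/(k-1), then after line k replaces the pick with probability 1/k, every one of the first k lines is the pick with probability 1/k. The file still has to be read to the end, but it never has to be stored.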
 
LVL 84

Expert Comment

by:ozo
@lines = <INFILE>; #grab all lines from the file

# fisher_yates_shuffle( \@array ) :
# generate a random permutation of @array in place
sub fisher_yates_shuffle {
        my $array = shift;
        my $i;
        for( $i = @$array; --$i; ){
            my $j = int rand ($i+1);
            next if $i == $j;
            @$array[$i,$j] = @$array[$j,$i];
        }
}

fisher_yates_shuffle( \@lines );  # permutes @lines
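
One way to use the shuffle for the original question is to take the first few elements afterwards, which gives several random lines with no repeats. This is only a sketch: the ten sample lines and the names $want and @picked are made up, and the routine is repeated so the script runs on its own.

```perl
use strict;
use warnings;

# Same routine as above, repeated so this sketch is self-contained.
sub fisher_yates_shuffle {
    my $array = shift;
    my $i;
    for ($i = @$array; --$i; ) {
        my $j = int rand($i + 1);
        next if $i == $j;
        @$array[$i, $j] = @$array[$j, $i];
    }
}

# Hypothetical sample data standing in for the lines read from INFILE.
my @lines;
push @lines, "line $_" for 1 .. 10;

fisher_yates_shuffle(\@lines);

# After shuffling, the first few elements are a random selection with no
# repeats. $want is an illustrative count, capped at the number of lines.
my $want = 3;
$want = @lines if $want > @lines;
my @picked = @lines[0 .. $want - 1];

print "$_\n" for @picked;
```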
 
LVL 2

Author Comment

by:johnny99
Martinag, you can certainly have the points, for being the only person to even try and answer the question in the way it was asked, i.e. slowly and carefully.

Other people who replied, I'm sure your stuff is very clever, but the information about what all of that stuff means is more valuable than any amount of neat solutions, whether they work or not...

Can you tell me a good place on the web or a good book to help me with questions like this?
 
LVL 84

Expert Comment

by:ozo
perldoc -q book
Found in perlfaq2.pod
  Perl Books

            A number of books on Perl and/or CGI programming are
            available. A few of these are good, some are ok, but
            many aren't worth your money. Tom Christiansen maintains
            a list of these books, some with extensive reviews, at
            http://www.perl.com/perl/critiques/index.html.

            The incontestably definitive reference book on Perl,
            written by the creator of Perl, is now in its second
            edition:

                Programming Perl (the "Camel Book"):
                    Authors: Larry Wall, Tom Christiansen, and Randal Schwartz
                    ISBN 1-56592-149-6      (English)
                    ISBN 4-89052-384-7      (Japanese)
                    URL: http://www.oreilly.com/catalog/pperl2/
                (French, German, Italian, and Hungarian translations also
                available)

            The companion volume to the Camel containing thousands
            of real-world examples, mini-tutorials, and complete
            programs (first premiering at the 1998 Perl Conference),
            is:

                The Perl Cookbook (the "Ram Book"):
                    Authors: Tom Christiansen and Nathan Torkington,
                                with Foreword by Larry Wall
                    ISBN: 1-56592-243-3
                    URL:  http://perl.oreilly.com/cookbook/

            If you're already a hard-core systems programmer, then
            the Camel Book might suffice for you to learn Perl from.
            But if you're not, check out:

                Learning Perl (the "Llama Book"):
                    Authors: Randal Schwartz and Tom Christiansen
                                with Foreword by Larry Wall
                    ISBN: 1-56592-284-0
                    URL:  http://www.oreilly.com/catalog/lperl2/

            Despite the picture at the URL above, the second edition
            of "Llama Book" really has a blue cover, and is updated
            for the 5.004 release of Perl. Various foreign language
            editions are available, including *Learning Perl on
            Win32 Systems* (the Gecko Book).

            If you're not an accidental programmer, but a more
            serious and possibly even degreed computer scientist who
            doesn't need as much hand-holding as we try to provide
            in the Llama or its defurred cousin the Gecko, please
            check out the delightful book, *Perl: The Programmer's
            Companion*, written by Nigel Chapman.

            You can order O'Reilly books directly from O'Reilly &
            Associates, 1-800-998-9938. Local/overseas is 1-707-829-
            0515. If you can locate an O'Reilly order form, you can
            also fax to 1-707-829-0104. See http://www.ora.com/ on
            the Web.

            What follows is a list of the books that the FAQ authors
            found personally useful. Your mileage may (but, we hope,
            probably won't) vary.

            Recommended books on (or muchly on) Perl follow; those
            marked with a star may be ordered from O'Reilly.

    References
                    *Programming Perl
                        by Larry Wall, Tom Christiansen, and Randal L. Schwartz

                    *Perl 5 Desktop Reference
                        by Johan Vromans

    Tutorials
                    *Learning Perl [2nd edition]
                        by Randal L. Schwartz and Tom Christiansen
                            with foreword by Larry Wall

                    *Learning Perl on Win32 Systems
                        by Randal L. Schwartz, Erik Olson, and Tom Christiansen,
                            with foreword by Larry Wall

                    Perl: The Programmer's Companion
                        by Nigel Chapman

                    Cross-Platform Perl
                        by Eric F. Johnson

                    MacPerl: Power and Ease
                        by Vicki Brown and Chris Nandor, foreword by Matthias Neeracher

    Task-Oriented
                    *The Perl Cookbook
                        by Tom Christiansen and Nathan Torkington
                            with foreword by Larry Wall

                    Perl5 Interactive Course [2nd edition]
                        by Jon Orwant

                    *Advanced Perl Programming
                        by Sriram Srinivasan

                    Effective Perl Programming
                        by Joseph Hall

    Special Topics
                    *Mastering Regular Expressions
                        by Jeffrey Friedl

                    How to Set up and Maintain a World Wide Web Site [2nd edition]
                        by Lincoln Stein


 
LVL 2

Author Comment

by:johnny99
Thanks Ozo -- it sounds as if I'm a lama, rather than a camel, at this stage.

johnny "that's pronounced 'lay-muh', right?" 99
 
LVL 84

Expert Comment

by:ozo
perldoc
will also give you access to much of the information in the camel,
as well as answers to many common questions.

  while( <INFILE> ){ }
is actually a magic shorthand for
  while( defined($_=<INFILE>) ){ }
this is covered in
perldoc perlop
under I/O Operators
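
A small sketch of why the defined() part matters. The sample data is an assumption chosen to make the point: its last line is "0" with no trailing newline, which Perl treats as a false value.

```perl
use strict;
use warnings;

# Hypothetical sample data: the last line is "0" with no newline after it.
my $text = "alpha\nbeta\n0";
open(INFILE, '<', \$text) or die "can't open in-memory file: $!";

my @got;
# Written out in full: read one line into $_, and stop only when the read
# returns undef (end of file), not merely when the line is a false value.
while (defined($_ = <INFILE>)) {
    chomp;
    push @got, $_;
}
close INFILE;

print scalar(@got), " lines read\n";    # the final "0" line is included
```

A hand-written while ($line = <INFILE>) without defined() is where this kind of last-line surprise can come from on older Perls, so the shorthand is usually the safer habit.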
 
LVL 4

Accepted Solution

by:
martinag earned 200 total points
All right, here's an answer.

I hope you will enjoy your perl programming in the future!

Martin