`grep...` vs. grep /.../

I have an application that extracts relatively few lines from a relatively large file.  Right now my code reads:

@records=`grep $whatToLookFor theFile`;

I did it this way rather than:

open(FIL,"theFile");
@records=grep /$whatToLookFor/,<FIL>;
close(FIL);

since I was worried that Perl might actually bring in the entire file and then do the grep.  Clearly my method has the disadvantage of using an extra shell invocation.  If I replace it with the =grep /.../ form, will Perl read the file line by line and do the grep, or will it be as horrible as I feared?
jhurstAsked:

ozoCommented:
grep /$whatToLookFor/,<FIL>; # will bring in the entire file and then do the grep

while( <FIL> ){
    push @records,$_ if /$whatToLookFor/; # will read lines one at a time and only store the matching lines in @records
}
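A self-contained sketch of the line-at-a-time approach above, for anyone who wants to drop it into a script. The sub name matching_lines and the use of a lexical filehandle with qr// are my own choices for illustration, not something from the original code:

```perl
use strict;
use warnings;

# Hypothetical helper: return only the lines of $path that match $pattern,
# reading one line at a time so the whole file is never held in memory.
sub matching_lines {
    my ($path, $pattern) = @_;
    my $re = qr/$pattern/;                  # compile the pattern once
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @records;
    while (my $line = <$fh>) {              # scalar context: one line per iteration
        push @records, $line if $line =~ $re;
    }
    close($fh);
    return @records;
}
```

It would be called as @records = matching_lines("theFile", $whatToLookFor); and only the matching lines end up stored in @records.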
jhurstAuthor Commented:
Thanks ozo.

You not only confirmed my fears but even suggested a better alternative.

However, I have tested this, and with a 40k-line file the `grep...` is MUCH faster than the while() solution.  I guess that grep is more efficiently written than the Perl interpreter.

BTW, how do you know this?  I looked everywhere I could think of and could find no documentation as to why
@x=grep /whatever/,<IN>;
would pull in the whole file first.

It sort of made sense to me that it would be as you suggested, since the alternative requires two processes and a pipe, but I still could not find it documented.
burtdavCommented:
(@x=grep /whatever/,<IN>) pulls in the whole file, because it uses the filehandle in list context. (print <IN>) does this, too. If you want to read a single line, you have to be careful to use the <> operator in scalar context.
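A small sketch of the context difference described above. It reads from an in-memory filehandle (my assumption, purely so it runs without a file on disk); the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# In-memory filehandle, so the sketch needs no file on disk.
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";

my $first = <$in>;    # scalar context: reads exactly one line
my @rest  = <$in>;    # list context: reads all remaining lines at once

close($in);

print "first: $first";
print "rest has ", scalar(@rest), " lines\n";
```

The first assignment consumes only "one"; the second slurps everything that is left, which is the same thing grep /.../,<IN> does to its argument list before it starts matching.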
jhurstAuthor Commented:
Where are you finding that in the docs, burtdav?  It seems VERY unlikely.
burtdavCommented:
I'm certain it's true. I probably read it in an O'Reilly camel-book, but it must be in the perldoc. Wait...
Thanks, Google.
It's in perlintro, second paragraph of "Files and I/O":
http://www.perldoc.com/perl5.8.0/pod/perlintro.html#Files-and-I-O
jhurstAuthor Commented:
Great, thanks.

What is interesting is that I have previously run, and have just repeated, experiments, and they sure appear to show that this is not the case.
burtdavCommented:
Please post them; I've not found an exception.
burtdavCommented:
I mean, if perlintro is wrong, what hope can there be? Is that too philosophical for a tech TA?
jhurstAuthor Commented:
OK, you can repeat my test; it is somewhat "non-scientific".  I just happened to have a large file, 260M of voter registration data.  I opened it on IN and then did the grep as shown above.

While running it I used top to watch memory and CPU usage.  I repeated the thing with the file that is produced by the grep, but using @data=<IN>; with no grep.

In the latter case much more memory was used.

I should add that my testing appears to indicate that
@x=`grep pattern file`; is by far the most efficient on a 2G Pentium running Linux.  I am assuming that this is the case because the external grep is more efficient than the Perl grep, and probably because you are right and my Perl scripts do load the whole file.  I have only 512M of RAM on the machine, btw.
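If the external grep stays, one possible way to drop the extra shell invocation mentioned in the question is the list form of a pipe open. This is a sketch only: the sub name grep_file is made up, it assumes a grep binary on the PATH, and the pattern is interpreted by grep (basic regular expressions by default), not by Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: run the external grep without going through a shell
# and return the lines it prints.
sub grep_file {
    my ($pattern, $path) = @_;
    # List-form pipe open: the arguments go straight to grep, so shell
    # metacharacters in $pattern are not interpreted by a shell.
    open(my $grep, '-|', 'grep', '-e', $pattern, '--', $path)
        or die "Cannot run grep: $!";
    my @records = <$grep>;    # list context, but only matching lines come down the pipe
    close($grep);             # grep exits 1 when nothing matched; not treated as an error here
    return @records;
}
```

It would be called as @x = grep_file($whatToLookFor, "theFile");. Reading the pipe in list context is harmless here because grep has already discarded the non-matching lines.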