Solved

Using Tie::File to open a file, grab a string, and search for it in a different file

Posted on 2009-06-28
Last Modified: 2012-05-07
In Perl, how can I use Tie::File to open file1, read the first line, and search for that string in file2? If it is found, print "found string"; if not, print "Not found". Then continue with the second line of file1, searching for it in file2, and so on until there are no more lines in file1.
I don't want to read the whole file into memory, because both of these files might get very large.
Question by:warrior32
2 Comments
 
LVL 84

Accepted Solution

by:
ozo earned 1000 total points
ID: 24733127
  "memory"
       This is an upper limit on the amount of memory that "Tie::File" will
       consume at any time while managing the file.  This is used for two
       things: managing the read cache and managing the deferred write buffer.

       Records read in from the file are cached, to avoid having to re-read
       them repeatedly.  If you read the same record twice, the first time it
       will be stored in memory, and the second time it will be fetched from
       the read cache.  The amount of data in the read cache will not exceed
       the value you specified for "memory".  If "Tie::File" wants to cache a
       new record, but the read cache is full, it will make room by expiring
       the least-recently visited records from the read cache.

       The default memory limit is 2Mib.  You can adjust the maximum read
       cache size by supplying the "memory" option.  The argument is the
       desired cache size, in bytes.

               # I have a lot of memory, so use a large cache to speed up access
               tie @array, 'Tie::File', $file, memory => 20_000_000;

       Setting the memory limit to 0 will inhibit caching; records will be
       fetched from disk every time you examine them.

       The "memory" value is not an absolute or exact limit on the memory
       used.  "Tie::File" objects contains some structures besides the read
       cache and the deferred write buffer, whose sizes are not charged
       against "memory".

       The cache itself consumes about 310 bytes per cached record, so if your
       file has many short records, you may want to decrease the cache memory
       limit, or else the cache overhead may exceed the size of the cached
       data.
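
As a rough sketch of how the "memory" option could be applied to the question, the following ties both files read-only with caching disabled and does a literal substring search. The subroutine names `tied_file_contains` and `report_matches` and the optional output filehandle are made up for illustration, and the file names come from the command line. Note that, as the quoted text says, "memory" is not an exact limit: Tie::File still keeps some bookkeeping of its own for the records it has seen.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: true if any record in the tied array contains
# $needle as a literal substring (not as a regex).
sub tied_file_contains {
    my ($records, $needle) = @_;
    for my $i (0 .. $#{$records}) {
        return 1 if index($records->[$i], $needle) >= 0;
    }
    return 0;
}

# Hypothetical helper: for each line of $file1, report whether it
# occurs anywhere in $file2. Output goes to $out (STDOUT by default).
sub report_matches {
    my ($file1, $file2, $out) = @_;
    $out ||= \*STDOUT;

    # memory => 0 disables the read cache; O_RDONLY avoids opening for write.
    tie my @needles, 'Tie::File', $file1, mode => O_RDONLY, memory => 0
        or die "could not tie $file1: $!\n";
    tie my @haystack, 'Tie::File', $file2, mode => O_RDONLY, memory => 0
        or die "could not tie $file2: $!\n";

    for my $i (0 .. $#needles) {
        my $s = $needles[$i];    # Tie::File strips the newline by default
        my $status = tied_file_contains(\@haystack, $s)
            ? 'found string' : 'Not found';
        print {$out} "$s: $status\n";
    }

    untie @needles;
    untie @haystack;
    return;
}

# Run only when two file names are given on the command line.
if (@ARGV == 2) {
    report_matches(@ARGV);
}
```

This still rescans file2 once per line of file1, so it saves memory rather than time.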

But there seems to be no reason to read file1 into memory at all.
file2 is the file that would be useful to read into memory, so that you can look each string up in a hash instead of searching the file. If you don't have enough memory for that, you may prefer to use Tie::Hash, or perhaps DBI.
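
A minimal sketch of the hash idea, assuming file2 fits in memory and that a match means a line of file1 is identical to a whole line of file2 (a hash lookup cannot find substrings). The subroutine names `load_line_set` and `report_exact_matches` and the optional output filehandle are hypothetical; file1 is still streamed one line at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of $path into a hash used as a set.
sub load_line_set {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "could not open $path: $!\n";
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        $seen{$line} = 1;
    }
    close($fh);
    return \%seen;
}

# Hypothetical helper: stream $file1 and report, for each line, whether
# an identical line exists in $file2. Output goes to $out (STDOUT by default).
sub report_exact_matches {
    my ($file1, $file2, $out) = @_;
    $out ||= \*STDOUT;

    my $seen = load_line_set($file2);    # only file2 is held in memory

    open(my $fh, '<', $file1) or die "could not open $file1: $!\n";
    while (my $s = <$fh>) {
        chomp $s;
        my $status = (exists $seen->{$s}) ? 'found string' : 'Not found';
        print {$out} "$s: $status\n";
    }
    close($fh);
    return;
}

# Run only when two file names are given on the command line.
if (@ARGV == 2) {
    report_exact_matches(@ARGV);
}
```

Each lookup is a single hash access, so file2 is read once instead of once per line of file1.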
 
LVL 39

Expert Comment

by:Adam314
ID: 24787570
I don't see any need for Tie::File here. You can just read file1 one line at a time and scan file2 for each string.
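use strict;
use warnings;

open(my $fh1, '<', 'file1.txt') or die "Could not open file1: $!\n";
open(my $fh2, '<', 'file2.txt') or die "Could not open file2: $!\n";
while (my $s1 = <$fh1>) {
	chomp $s1;
	# Rewind file2 so every search starts from its first line
	seek($fh2, 0, 0) or die "Could not rewind file2: $!\n";
	my $found = 0;
	while (my $line = <$fh2>) {
		# index() matches $s1 literally; /$s1/ would treat it as a regex
		next if index($line, $s1) < 0;
		print "$s1: found string\n";
		$found = 1;
		last;
	}
	print "$s1: Not found\n" unless $found;
}
close($fh1);
close($fh2);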

