Solved

Using Tie::File to open a file, grab a string, and search for it in a different file

Posted on 2009-06-28
Last Modified: 2012-05-07
In Perl, how can I use Tie::File to open file1, read the first line, and search for that string in file2? If it is found, print "found string"; if not, print "Not found". Then continue with the second line of file1, searching for it in file2, and so on until there are no more lines in file1.
I don't want to read the whole file into memory because both of these files might get very large.
Question by:warrior32
2 Comments
 

Accepted Solution

by:
ozo earned 250 total points
ID: 24733127
From the Tie::File documentation:

  "memory"
       This is an upper limit on the amount of memory that "Tie::File" will
       consume at any time while managing the file.  This is used for two
       things: managing the read cache and managing the deferred write buffer.

       Records read in from the file are cached, to avoid having to re-read
       them repeatedly.  If you read the same record twice, the first time it
       will be stored in memory, and the second time it will be fetched from
       the read cache.  The amount of data in the read cache will not exceed
       the value you specified for "memory".  If "Tie::File" wants to cache a
       new record, but the read cache is full, it will make room by expiring
       the least-recently visited records from the read cache.

       The default memory limit is 2Mib.  You can adjust the maximum read
       cache size by supplying the "memory" option.  The argument is the
       desired cache size, in bytes.

               # I have a lot of memory, so use a large cache to speed up access
               tie @array, 'Tie::File', $file, memory => 20_000_000;

       Setting the memory limit to 0 will inhibit caching; records will be
       fetched from disk every time you examine them.

       The "memory" value is not an absolute or exact limit on the memory
       used.  "Tie::File" objects contains some structures besides the read
       cache and the deferred write buffer, whose sizes are not charged
       against "memory".

       The cache itself consumes about 310 bytes per cached record, so if your
       file has many short records, you may want to decrease the cache memory
       limit, or else the cache overhead may exceed the size of the cached
       data.
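
For what it's worth, here is a minimal sketch of how the "memory" option could be applied to the question as asked. The helper name search_with_tie and its arguments are made up for illustration, and it assumes a "match" means the string from file1 appears anywhere in a line of file2. Note that, as the excerpt says, the cache limit does not cover everything Tie::File keeps in memory, so treat this as a sketch rather than a guarantee of bounded memory use.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: for each record of $file1, scan $file2 through a
# tied array and report whether the string occurs in any record.
sub search_with_tie {
    my ($file1, $file2, $cache_bytes) = @_;

    # Open both files read-only and cap the read cache for each.
    tie(my @needles, 'Tie::File', $file1, mode => O_RDONLY, memory => $cache_bytes)
        or die "Could not tie $file1: $!\n";
    tie(my @haystack, 'Tie::File', $file2, mode => O_RDONLY, memory => $cache_bytes)
        or die "Could not tie $file2: $!\n";

    my @report;
    for my $i (0 .. $#needles) {
        my $s1    = $needles[$i];    # records come back without the newline
        my $found = 0;
        for my $j (0 .. $#haystack) {
            # index() does a literal substring match, not a regex match
            if (index($haystack[$j], $s1) >= 0) {
                $found = 1;
                last;
            }
        }
        push @report, $found ? "$s1: found string" : "$s1: Not found";
    }

    untie @needles;
    untie @haystack;
    return @report;
}
```

It could be called as: print "$_\n" for search_with_tie('file1.txt', 'file2.txt', 1_000_000);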

But there seems to be no reason to read file1 into memory at all.
file2 is the file that would be useful to read into memory, so you can look each string up in a hash instead of searching a file. But if you don't have enough memory for that, you may prefer to use Tie::Hash, or perhaps a database through DBI.
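
A minimal sketch of the hash idea, assuming file2 fits in memory and that a "match" means a whole line of file2 is equal to the string from file1 (a hash cannot do substring matches). The names load_lines and report_matches are made up for illustration.

```perl
use strict;
use warnings;

# Build a lookup table holding every line of the file being searched.
# Assumption: a match means an entire line of file2 equals the string.
sub load_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!\n";
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        $seen{$line} = 1;
    }
    close($fh);
    return \%seen;
}

# Stream the first file one line at a time and report each lookup
# to the given output handle.
sub report_matches {
    my ($path, $seen, $out) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!\n";
    while (my $s1 = <$fh>) {
        chomp $s1;
        my $msg = exists($seen->{$s1}) ? "found string" : "Not found";
        print {$out} "$s1: $msg\n";
    }
    close($fh);
}
```

It could be called as: report_matches('file1.txt', load_lines('file2.txt'), \*STDOUT); This way file2 is read only once, instead of once per line of file1.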

Expert Comment

by:Adam314
ID: 24787570
I don't see any need for Tie::File for this; you can just read file1 one line at a time and search file2 for each string.
open(my $fh1, '<', 'file1.txt') or die "Could not open file1: $!\n";
open(my $fh2, '<', 'file2.txt') or die "Could not open file2: $!\n";

while (my $s1 = <$fh1>) {
	chomp $s1;
	seek $fh2, 0, 0;   # rewind file2 for each new search string
	my $found = 0;
	while (my $line = <$fh2>) {
		# index() matches $s1 literally; /$s1/ would treat it as a regex
		next if index($line, $s1) < 0;
		print "$s1: found string\n";
		$found = 1;
		last;
	}
	print "$s1: Not found\n" unless $found;
}

close($fh1);
close($fh2);
