best way of fetching some records

Hello,
Please find attached a list of records collected from various databases.
What is, in your opinion, the best way to extract only the records of the form:

***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***

(i.e. with only the TITLE and PACKAGE fields)?

I guess a Perl script, but other scripting languages would be fine too.

Thanks a lot for your help,
fabiano
records.txt
Asked by: fabiano petrone
wilcoxon commented:
You could do this in a simple regex.  Here is Perl code if you want only records with TITLE and PACKAGE (and nothing else).
# assume $string contains the record you want to check
unless ($string =~ m{\*+BEGIN OF RECORD\*+\s+TITLE: ([^\n]+)\s*PACKAGE: ([^\n]+)\s*\*+END OF RECORD\*}ms) {
    next; # or return or something else to skip processing
}
my ($title, $pkg) = ($1, $2);
# do whatever you want with the values

Let me know if you need more help.  If you do, please provide more details (for example: are the files plain text, how big are they, and what do you want to do with the extracted values?).
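
The snippet above checks one record held in $string. As a minimal sketch of how the same idea could be applied to a whole file's text in one pass, assuming the text is already in a scalar: the sub name two_field_records and the sample data are made up for illustration, and the pattern is a slightly stricter variant (it excludes CR from the captures and requires whitespace between the lines).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns [title, package] pairs
# for every record that holds only a TITLE line and a PACKAGE line.
sub two_field_records {
    my ($text) = @_;
    my @found;
    # /g walks through the text match by match; /x allows the layout below,
    # so literal spaces in the pattern are escaped with a backslash.
    while ($text =~ m{
            \*+BEGIN\ OF\ RECORD\*+ \s+
            TITLE:\ ([^\r\n]+)      \s+
            PACKAGE:\ ([^\r\n]+)    \s+
            \*+END\ OF\ RECORD\*+
        }gx) {
        push @found, [$1, $2];
    }
    return @found;
}

# In-memory sample standing in for the contents of records.txt
my $sample = <<'END_SAMPLE';
***BEGIN OF RECORD***
TITLE: Publishers weekly
AVAILABILITY: Available from 1997.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***

***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
END_SAMPLE

print "$_->[0] | $_->[1]\n" for two_field_records($sample);
```

Because PACKAGE must directly follow the TITLE line, the first sample record (which has an AVAILABILITY line in between) is skipped. This approach needs the whole file in memory, so it only suits files of moderate size.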
fabiano petrone (author) commented:
Hello, Wilcoxon
First of all, thanks a lot for your reply.
I've tried the following code, but something still goes wrong. Can you help me?

#c:\perl\bin\perl.exe
use strict;
my $in_file = "records.txt";
my $out_file = "results.txt";

open INFILE, "< $in_file" or die "Can't open $in_file $!";
open OUTFILE, "> $out_file" or die "Can't open $out_file $!";

while (<INFILE>) {
# assume $in_file contains the record you want to check
unless ($in_file =~ m{\*+BEGIN OF RECORD\*+\s+TITLE: ([^\n]+)\s*PACKAGE: ([^\n]+)\s*\*+END OF RECORD\*}ms) {
    next; # or return or something else to skip processing
}
my ($title, $pkg) = ($1, $2);
# do whatever you want with the values
  print OUTFILE $title, $pkg, "\n";
}


Thanks again,
Fabiano
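
For what it's worth, two things in the script above stand out: the regex is applied to $in_file, which holds the file name rather than the text read from the file, and the while loop reads one line at a time, so a pattern that spans several lines has nothing to match against. Below is a minimal sketch of a line-by-line alternative, assuming the record layout shown in the question. The sub name filter_records and the in-memory sample are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collects the field lines between
# the BEGIN and END markers and keeps a record only if it has exactly
# a TITLE line followed by a PACKAGE line.
sub filter_records {
    my ($fh) = @_;
    my (@out, @fields);
    my $in_record = 0;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;    # strip LF or CR+LF plus trailing blanks
        if ($line =~ /^\*+BEGIN OF RECORD\*+$/) {
            @fields    = ();
            $in_record = 1;
        }
        elsif ($line =~ /^\*+END OF RECORD\*+$/) {
            if ($in_record and @fields == 2) {
                my ($title) = $fields[0] =~ /^TITLE:\s*(.+)/;
                my ($pkg)   = $fields[1] =~ /^PACKAGE:\s*(.+)/;
                push @out, "$title\t$pkg" if defined $title and defined $pkg;
            }
            $in_record = 0;    # records without a BEGIN marker end up here too
        }
        elsif ($in_record and length $line) {
            push @fields, $line;
        }
    }
    return @out;
}

# In-memory sample standing in for records.txt
my $sample = <<'END_SAMPLE';
***BEGIN OF RECORD***
TITLE: Communications of the ACM
AVAILABILITY: Available from 1958 volume: 1 issue: 1.
PACKAGE: ACM Digital Library:Full Text
***END OF RECORD***

AVAILABILITY: Available from 1999.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***

***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
END_SAMPLE

open my $fh, '<', \$sample or die "cannot open sample: $!";
print "$_\n" for filter_records($fh);
close $fh;
```

With a real file, the handle would come from something like open my $fh, '<', 'records.txt' instead of the in-memory string. Only one line is held at a time, so file size should not matter.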
wilcoxon commented:
Here's one way to do it that will work regardless of size of file.  If the files are always smallish, I'd maybe look at using File::Slurp.
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;
# tie the file to an array so lines are fetched on demand, not loaded all at once
tie my @file, 'Tie::File', 'records.txt', mode => O_RDONLY or die "could not tie records.txt: $!";
open my $out, '>', 'results.txt' or die "could not write results.txt: $!";
# a wanted record is exactly four consecutive lines: BEGIN, TITLE, PACKAGE, END
for my $i (0..@file-4) {
    if ($file[$i] =~ m{\*+BEGIN OF RECORD\*+}
        and my ($ttl) = $file[$i+1] =~ m{^\s*TITLE:\s*(.+)}
        and my ($pkg) = $file[$i+2] =~ m{^\s*PACKAGE:\s*(.+)}
        and $file[$i+3] =~ m{\*+END OF RECORD\*+}) {
        print $out $ttl, '  ', $pkg, "\n";
    }
    # no manual change to $i here: a foreach over a range resets it on every pass
}
close $out;

tel2 commented:
Hi fabianope65,

Here are the first 17 lines of your file:
***BEGIN OF RECORD***
TITLE: Publishers weekly
AVAILABILITY: Available from 1997. 
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***


***BEGIN OF RECORD***
TITLE: Communications of the ACM
AVAILABILITY: Available from 1958 volume: 1 issue: 1. 
PACKAGE: ACM Digital Library:Full Text
***END OF RECORD***


AVAILABILITY: Available from 1999. 
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***


What do you want done with records which don't start with a "***BEGIN OF RECORD***" marker?  See the last part of the above extract, for an example.

Thanks.
tel2
tel2 commented:
Hi again fabianope65,

Assuming records.txt has Windows style line terminators (i.e. CR+LF), I think this Perl script should work for you:
{
  local $/ = "\r\n\r\n";  # Temporarily change the record terminator to CR+LF+CR+LF (i.e. paragraph mode)
  open INFILE, "<records.txt" or die "Can't open records.txt";
  while (<INFILE>)
  {
    next unless ($title, $pkg) = $_ =~ /\*\*\*BEGIN OF RECORD\*\*\*.*?\sTITLE: (.+?)\n.*?PACKAGE: (.+?)\n.*?\*\*\*END OF RECORD\*\*\*/s;
    # Do whatever you want with $title & $pkg
  }
}

Or if records.txt has UNIX style line terminators, change:
    local $/ = "\r\n\r\n";  # ...
to:
    local $/ = '';

Both options should work with any size records.txt file.

And if you can answer the question in my previous post sometime, that would be good.
Also, what OS are you running?
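
One thing to keep in mind about the pattern above: with the /s modifier, the .*? parts can span lines, so a record with extra fields between TITLE and PACKAGE (such as AVAILABILITY) also matches. If only two-field records are wanted, a stricter check per paragraph is one option. Here is a minimal sketch, assuming records are separated by blank lines and that line endings are plain LF by the time Perl sees them. The sub name strict_paragraph_filter and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): reads in paragraph mode and keeps
# only records made of exactly four lines: BEGIN, TITLE, PACKAGE, END.
sub strict_paragraph_filter {
    my ($fh) = @_;
    my @out;
    local $/ = '';    # paragraph mode: one or more blank lines end a record
    while (my $para = <$fh>) {
        my @lines = grep { /\S/ } split /\r?\n/, $para;
        next unless @lines == 4;
        next unless $lines[0] =~ /^\*+BEGIN OF RECORD\*+\s*$/
                and $lines[3] =~ /^\*+END OF RECORD\*+\s*$/;
        # list assignment yields 0 in boolean context when the match fails
        my ($title) = $lines[1] =~ /^TITLE:\s*(.*\S)/   or next;
        my ($pkg)   = $lines[2] =~ /^PACKAGE:\s*(.*\S)/ or next;
        push @out, [$title, $pkg];
    }
    return @out;
}

# In-memory sample standing in for records.txt
my $sample = <<'END_SAMPLE';
***BEGIN OF RECORD***
TITLE: Publishers weekly
AVAILABILITY: Available from 1997.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***


AVAILABILITY: Available from 1999.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***


***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
END_SAMPLE

open my $fh, '<', \$sample or die "cannot open sample: $!";
print "$_->[0] | $_->[1]\n" for strict_paragraph_filter($fh);
close $fh;
```

Records that lack a BEGIN marker are simply dropped here, which may or may not be what is wanted; that depends on the answer to the question in the previous post.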
fabiano petrone (author) commented:
Hi, I'm currently on Windows 7 with ActivePerl 5.16.3.1604.
Thanks,
fabiano
fabiano petrone (author) commented:
Hello,
Thanks a lot to both of you. The wilcoxon script works perfectly.
Thanks again,
fabiano
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.