fabiano petrone
asked on
best way of fetching some records
Hello,
Please find attached a list of records that come from various databases.
What, in your opinion, is the best way to extract only the records of the form:
***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
(i.e. with only the TITLE and PACKAGE fields)?
I guess a Perl script would do, but other scripting languages would be fine too.
Thanks a lot for your help,
fabiano
records.txt
ASKER
Hello Wilcoxon,
First of all, thanks a lot for your reply.
I've tried the following code, but something still goes wrong. Can you help me?
#c:\perl\bin\perl.exe
use strict;
my $in_file = "records.txt";
my $out_file = "results.txt";
open INFILE, "< $in_file" or die "Can't open $in_file $!";
open OUTFILE, "> $out_file" or die "Can't open $out_file $!";
while (<INFILE>) {
    # assume $in_file contains the record you want to check
    unless ($in_file =~ m{\*+BEGIN OF RECORD\*+\s+TITLE: ([^\n]+)\s*PACKAGE: ([^\n]+)\s*\*+END OF RECORD\*}ms) {
        next; # or return or something else to skip processing
    }
    my ($title, $pkg) = ($1, $2);
    # do whatever you want with the values
    print OUTFILE $title, $pkg, "\n";
}
Thanks again, Fabiano
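A likely cause of the problem in the script above, offered as a sketch rather than as the accepted answer: the pattern is tested against $in_file, which holds the file name "records.txt", and the loop reads one line at a time, so a multi-line record can never match. Assuming records.txt fits in memory, one option is to hold the whole file in a string and scan it with a global match. The sub name title_package_records and the inline sample are hypothetical; [^\r\n] replaces [^\n] so that a trailing CR is not captured on CR+LF files.

```perl
use strict;
use warnings;

# Return [title, package] pairs for records that consist of a TITLE line
# followed directly by a PACKAGE line. Takes the whole file content as a string.
sub title_package_records {
    my ($text) = @_;
    my @found;
    while ($text =~ m{\*+BEGIN OF RECORD\*+\s+TITLE: ([^\r\n]+)\s+PACKAGE: ([^\r\n]+)\s+\*+END OF RECORD\*+}g) {
        push @found, [$1, $2];
    }
    return @found;
}

# Inline sample standing in for records.txt (record contents taken from the thread).
my $sample_text = <<'END_SAMPLE';
***BEGIN OF RECORD***
TITLE: Publishers weekly
AVAILABILITY: Available from 1997.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***
***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
END_SAMPLE

for my $rec (title_package_records($sample_text)) {
    print "$rec->[0] | $rec->[1]\n";
}
# prints: Interacting with computers | Elsevier SD Freedom Collection:Full Text
```

Because nothing in the pattern can span an extra line between TITLE and PACKAGE, a record that also carries an AVAILABILITY field is skipped. To run this on the real data, $sample_text would be replaced by the slurped contents of records.txt.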
Hi fabianope65,
Here are the first 17 lines of your file:
***BEGIN OF RECORD***
TITLE: Publishers weekly
AVAILABILITY: Available from 1997.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***
***BEGIN OF RECORD***
TITLE: Communications of the ACM
AVAILABILITY: Available from 1958 volume: 1 issue: 1.
PACKAGE: ACM Digital Library:Full Text
***END OF RECORD***
AVAILABILITY: Available from 1999.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***
What do you want done with records which don't start with a "***BEGIN OF RECORD***" marker? See the last part of the above extract, for an example.
Thanks.
tel2
Hi again fabianope65,
Assuming records.txt has Windows style line terminators (i.e. CR+LF), I think this Perl script should work for you:
{
    local $/ = "\r\n\r\n"; # Temporarily change the record terminator to CR+LF+CR+LF (i.e. records separated by a blank line)
    open INFILE, "<records.txt" or die "Can't open records.txt: $!";
    while (<INFILE>)
    {
        my ($title, $pkg);
        # Skip any chunk that does not contain BEGIN, TITLE, PACKAGE and END in that order
        next unless ($title, $pkg) = $_ =~ /\*\*\*BEGIN OF RECORD\*\*\*.*?\sTITLE: (.+?)\n.*?PACKAGE: (.+?)\n.*?\*\*\*END OF RECORD\*\*\*/s;
        # Do whatever you want with $title & $pkg
    }
    close INFILE;
}
Or, if records.txt has UNIX-style line terminators, change:
    local $/ = "\r\n\r\n"; # ...
to:
    local $/ = '';
Both options should work with any size records.txt file.
And if you can answer the question in my previous post sometime, that would be good.
Also, what OS are you running?
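One thing to watch with the pattern above: because of the /s modifier, .*? can span extra lines, so a record with an AVAILABILITY field between TITLE and PACKAGE would still match, which is not what the original question asked for. A rough alternative sketch follows, assuming field names are upper-case words followed by a colon, as in the excerpt. It splits on the END OF RECORD marker instead of relying on blank lines, normalises CR+LF to LF so the line-terminator question does not matter, and keeps a record only when its field names are exactly TITLE and PACKAGE. Skipping chunks that lack a BEGIN marker is an assumed policy, since that question is still open in this thread. The sub name only_title_and_package is hypothetical.

```perl
use strict;
use warnings;

# Return a list of hash refs ({ TITLE => ..., PACKAGE => ... }) for records
# whose only fields are TITLE and PACKAGE.
sub only_title_and_package {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;    # normalise CR+LF to LF so one pattern fits both
    my @kept;
    for my $chunk (split /^\*+END OF RECORD\*+[ \t]*\n?/m, $text) {
        # Assumed policy: ignore chunks that do not start with a BEGIN marker.
        next unless $chunk =~ s/\A\s*\*+BEGIN OF RECORD\*+[ \t]*\n//;
        my %field;
        my $well_formed = 1;
        for my $line (grep { /\S/ } split /\n/, $chunk) {
            if ($line =~ /^([A-Z]+):\s*(.*)$/) {
                $field{$1} = $2;
            }
            else {
                $well_formed = 0;    # a line that is not "NAME: value"
            }
        }
        next unless $well_formed;
        next unless join(',', sort keys %field) eq 'PACKAGE,TITLE';
        push @kept, \%field;
    }
    return @kept;
}

# Inline sample standing in for records.txt (record contents taken from the thread).
my $records_text = <<'END_SAMPLE';
***BEGIN OF RECORD***
TITLE: Communications of the ACM
AVAILABILITY: Available from 1958 volume: 1 issue: 1.
PACKAGE: ACM Digital Library:Full Text
***END OF RECORD***
AVAILABILITY: Available from 1999.
PACKAGE: EBSCOhost Business Source Complete:Full Text
***END OF RECORD***
***BEGIN OF RECORD***
TITLE: Interacting with computers
PACKAGE: Elsevier SD Freedom Collection:Full Text
***END OF RECORD***
END_SAMPLE

for my $rec (only_title_and_package($records_text)) {
    print "$rec->{TITLE} | $rec->{PACKAGE}\n";
}
# prints: Interacting with computers | Elsevier SD Freedom Collection:Full Text
```

Parsing the fields into a hash costs a few more lines than a single regex, but it makes the "only these two fields" rule explicit and easy to change.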
ASKER
Hi, I'm currently on Windows 7 with ActivePerl 5.16.3.1604.
thanks
fabiano
ASKER
hello,
Thanks a lot to both of you. The wilcoxon script works perfectly. Thanks again, fabiano
Let me know if you need more help. If you do, please provide more details (such as whether the files are plain text, how big they are, what you want to do with the extracted values, etc.).