ranadhir

perl pattern parsing

I am new to Perl.
I need help parsing entries like the following from a file and capturing the parts into holder variables $var, $val, $val1, and so on:
FRUIT=APPLE (capture into $var and $val)
FRUIT=APPLE&ORANGE (capture into $var, $val1 [as APPLE], and $val2 [as ORANGE])
FRUIT=(APPLE(ORANGE)(BANANA)) (capture into $var, $val1 [as APPLE], $val2 [as ORANGE], and $val3 [as BANANA])
FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE)) (capture into $var and $val [as APPLE], where the key BEST is decided in advance)
Any help will be appreciated.
Adam314

In this case:
    FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))
Why don't you want "GOOD FOR HEALTH" and "AND DELICIOS TOO"?  Something like this should get you started:

my @lines = 
('FRUIT=APPLE',
 'FRUIT=APPLE&ORANGE',
 'FRUIT=(APPLE(ORANGE)(BANANA))',
 'FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))');
 
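foreach (@lines) {
	# Split at the first "=": key on the left, everything else on the right
	my ($key, $allvalues) = /^(.*?)=(.*)$/;
	# Break the right-hand side on runs of "(", ")" or "&"; grep drops empty fields
	my @values = grep {$_} split(/[\(\)&]+/, $allvalues);
	print "Key=$key\nValues=" . join(", ", @values) . "\n";
	print "\n";
}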


ozo

You don't need \ before ( or ) inside [].

I also don't understand why you don't get $val1 = "GOOD FOR HEALTH" or $val1 = "good".
For that matter, in FRUIT=APPLE, why is it $val and not $val1?
And is it $var="FRUIT" and $val="APPLE"? You did not really specify.
You also did not specify what $var should be in any of those cases.
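
On the first point: inside a bracketed character class, ( and ) are ordinary characters, so the escaped and unescaped classes should split a string identically. A small check along those lines, using one of the sample values from this thread (the variable names are only illustrative):

```perl
use strict;
use warnings;

my $allvalues = '(APPLE(ORANGE)(BANANA))';

# Escaped and unescaped parentheses mean the same thing inside [...]
my @escaped   = grep { length $_ } split /[\(\)&]+/, $allvalues;
my @unescaped = grep { length $_ } split /[()&]+/,   $allvalues;

print join(', ', @escaped),   "\n";
print join(', ', @unescaped), "\n";
```

Both lines should print the same three values. The grep is there because split keeps the empty field in front of the leading parenthesis.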
ranadhir

ASKER

$var will be 'FRUIT' in all cases.
For the last case, FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE)), I need $var [as FRUIT] and $val=APPLE. We can assume 'BEST' is a constant.
You could use $var for the key and $val1, $val2, $val3, ... for the values, depending on how many values are to be extracted in each of the use cases above.

# This assigns $val only if "(BEST=" is seen, and assigns as many of $val1, $val2, $val3 as are seen
for(
'FRUIT=APPLE',
 'FRUIT=APPLE&ORANGE',
 'FRUIT=(APPLE(ORANGE)(BANANA))',
 'FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))',
){
  my($var,$val,$val1,$val2,$val3)=/(\w+)=(?=.*\(BEST=([^)]*))?\W*(\w+)\W*(\w*)\W*(\w*)/;
  print "$_\n\$var=$var\n\$val=$val\n\$val1=$val1\n\$val2=$val2\n\$val3=$val3\n\n";
}
I did not intend this to be so complicated. I was describing four different use cases as a knowledge base, so we can surely have a different answer for each of them.
They need not be part of a single loop.
Please split them up into four separate answers, one per use case, so that I can see exactly how the pattern grouping works.
In ozo's latest post, this line:
    my($var,$val,$val1,$val2,$val3)=/(\w+)=(?=.*\(BEST=([^)]*))?\W*(\w+)\W*(\w*)\W*(\w*)/;
is what does the actual parsing.  The for loop is there so that each of your use cases can be tested, and then the results are printed.  If you have just 1 string you want to test, you could test it with:
    $_ = 'FRUIT=APPLE';     #Or whatever your string is
    my($var,$val,$val1,$val2,$val3)=/(\w+)=(?=.*\(BEST=([^)]*))?\W*(\w+)\W*(\w*)\W*(\w*)/;

If you don't want your string in $_, you can use another variable and bind the match to it with =~; the RE itself stays the same:
    my $str = 'FRUIT=APPLE';     #Or whatever your string is
    my($var,$val,$val1,$val2,$val3) = $str =~ /(\w+)=(?=.*\(BEST=([^)]*))?\W*(\w+)\W*(\w*)\W*(\w*)/;
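
The same binding works with a much simpler pattern, which may make it easier to see what is going on. The sketch below is an illustration rather than part of the solution above: it assumes a line of the exact form KEY=A&B, and shows that a match in list context returns its captures in order, or an empty list when the pattern does not match.

```perl
use strict;
use warnings;

my $str = 'FRUIT=APPLE&ORANGE';

# In list context, a successful match returns its captures in order
my ($var, $val1, $val2) = $str =~ /^(\w+)=(\w+)&(\w+)$/;
print "\$var=$var \$val1=$val1 \$val2=$val2\n";

# A failed match returns an empty list, so nothing is captured
my @captures = 'FRUIT=APPLE' =~ /^(\w+)=(\w+)&(\w+)$/;
print "captures from the non-matching line: ", scalar(@captures), "\n";
```

Checking the number of captures (or testing the match in an if) before using the variables avoids working with undefined values when a line has an unexpected shape.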
An all-encompassing solution like this is quite daunting for a newbie like me.
That is why I asked for a step-by-step approach: a separate solution for each of the simple use cases I mentioned.
Once I understand each of them, I can make sense of the combined one above myself.
How do you want to determine which use case you have?  Here is an example that looks for the exact string, but I don't think it'll be very useful.

my $str = 'FRUIT=APPLE';   #Or whatever you want
 
my ($var, $val1, $val2, $val3);
if   ($str eq 'FRUIT=APPLE') {$var='FRUIT';$val1='APPLE';}
elsif($str eq 'FRUIT=APPLE&ORANGE') {$var='FRUIT'; $val1='APPLE'; $val2='ORANGE';}
elsif($str eq 'FRUIT=(APPLE(ORANGE)(BANANA))') {$var='FRUIT'; $val1='APPLE'; $val2='ORANGE'; $val3='BANANA';}
elsif($str eq 'FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))') {$var='FRUIT'; $val1='APPLE';}
else {warn "Unknown string\n";}
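
If exact string comparison is too rigid, one alternative is to decide the use case from the shape of the line instead. This is only a sketch under assumptions: the function name classify and the labels it returns are made up for illustration, keys and values are assumed to be word characters, and the BEST= form is checked first because it also begins with a parenthesis.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a label describing the shape of one line
sub classify {
    my ($str) = @_;
    return 'best'    if $str =~ /\(BEST=/;               # contains the fixed BEST= key
    return 'nested'  if $str =~ /^\w+=\(/;               # KEY=(A(B)(C))
    return 'list'    if $str =~ /^\w+=\w+(?:&\w+)+$/;    # KEY=A&B...
    return 'simple'  if $str =~ /^\w+=\w+$/;             # KEY=VALUE
    return 'unknown';
}

for my $str (
    'FRUIT=APPLE',
    'FRUIT=APPLE&ORANGE',
    'FRUIT=(APPLE(ORANGE)(BANANA))',
    'FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))',
) {
    print classify($str), ": $str\n";
}
```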


What I meant is this: assume we have four files, each containing entries of only one type. The first file has all its entries in the form of the first use case:
FRUIT=APPLE
VEGETABLE=CARROT
COLOR=YELLOW

The second file has entries like:
FRUIT=APPLE&ORANGE
VEGETABLE=CARROT&TOMATO
COLOR=YELLOW&RED

and so on for the other two use cases.
I intended to ask for a separate parsing routine for each of these files.
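
A minimal sketch of what one routine per file type might look like. It is not the accepted solution, and the sub names (parse_simple, parse_amp, parse_nested, parse_best) are hypothetical. It assumes keys and plain values consist of word characters and that each line has already been read and chomped. Each routine returns the key followed by its values, or an empty list if the line does not have the expected shape.

```perl
use strict;
use warnings;

# Use case 1: KEY=VALUE
sub parse_simple {
    my ($line) = @_;
    return $line =~ /^(\w+)=(\w+)$/;    # ($var, $val), or () if no match
}

# Use case 2: KEY=A&B (any number of &-separated values)
sub parse_amp {
    my ($line) = @_;
    my ($var, $rest) = $line =~ /^(\w+)=(\w+(?:&\w+)*)$/ or return;
    return ($var, split /&/, $rest);
}

# Use case 3: KEY=(A(B)(C)); every run of non-parenthesis text is a value
sub parse_nested {
    my ($line) = @_;
    my ($var, $rest) = $line =~ /^(\w+)=(\(.*\))$/ or return;
    return ($var, $rest =~ /([^()]+)/g);
}

# Use case 4: only the value of the fixed key BEST is wanted
sub parse_best {
    my ($line) = @_;
    return $line =~ /^(\w+)=.*\(BEST=([^()]*)\)/;
}

for my $result (
    [ parse_simple('COLOR=YELLOW') ],
    [ parse_amp('VEGETABLE=CARROT&TOMATO') ],
    [ parse_nested('FRUIT=(APPLE(ORANGE)(BANANA))') ],
    [ parse_best('FRUIT=(GOOD FOR HEALTH(AND DELICIOS TOO)(BEST=APPLE))') ],
) {
    my ($var, @vals) = @$result;
    print "$var: ", join(', ', @vals), "\n";
}
```

With real files, each routine would be called once per line inside a while loop over the filehandle, with the values copied into $val1, $val2, and so on as needed. Returning a list keeps the routines independent of how many values a line happens to contain.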
 
ASKER CERTIFIED SOLUTION
ozo