yelbow
perl extract "subfields" from text string
Hi,
I have a file with contents as follows:
000000001 ZZZ $$aTextinA$$bTextinB$$cTextinC
There will be several of these, but have just included one for clarity.
For each line, I'm wanting to split the data into variables/fields using a perl script. Each field is delimited by "$$". So, for example, from the above I'm wanting to end up with:
$subfield_a = "TextinA"
$subfield_b = "TextinB"
$subfield_c = "TextinC"
Equally, if, say, the line was as follows:
000000001 ZZZ $$aTextinA$$cTextinC
I would want to end up with:
$subfield_a = "TextinA"
$subfield_b = ""
$subfield_c = "TextinC"
i.e. if a field is not present (in this case $$b), its variable should be set to an empty string.
I'm currently doing that as follows (while looping round the file):
@fields = split ('\$\$',$line);
@suba = grep {/^a/} @fields;
if ( $suba[0] =~ /^a(.*)/ ) {
    $subfield_a = $1;
} else {
    $subfield_a = "";
}
@subb = grep {/^b/} @fields;
if ( $subb[0] =~ /^b(.*)/ ) {
    $subfield_b = $1;
} else {
    $subfield_b = "";
}
@subc = grep {/^c/} @fields;
if ( $subc[0] =~ /^c(.*)/ ) {
    $subfield_c = $1;
} else {
    $subfield_c = "";
}
But I'm guessing there is a far more efficient way of achieving this?
ASKER CERTIFIED SOLUTION
my %fields = $line=~/\$\$(\w)([^\$\n]*)/g;
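To illustrate how that one-liner behaves when a subfield is missing, here is a minimal sketch using the second sample line from the question. Pre-seeding the hash with empty defaults for a, b and c is my own addition, not part of the answer above; it assumes those three are the only subfields you care about.

```perl
use strict;
use warnings;

# Second sample line from the question ($$b is absent).
my $line = '000000001 ZZZ $$aTextinA$$cTextinC';

# Defaults come first; the (code, text) pairs returned by the /g match
# in list context follow and override them for subfields that are present.
my %fields = (
    a => '', b => '', c => '',
    $line =~ /\$\$(\w)([^\$\n]*)/g,
);

print "a=[$fields{a}] b=[$fields{b}] c=[$fields{c}]\n";
# prints: a=[TextinA] b=[] c=[TextinC]
```

Note that `[^\$\n]*` stops at the first `$`, so a subfield whose text itself contains a single dollar sign would be cut short at that point.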
Or, if you really want the variables in $subfield_a, $subfield_b, $subfield_c:
${"subfield_$1"}=$2 while $line=~/\$\$(.)([^\$\n]*)/g;
but I would not recommend that method, since it relies on symbolic references, which `use strict 'refs'` disallows.
better might be
${${{a=>\$subfield_a,b=>\$
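The line above is cut off in the source, so the following is only a guess at the idea rather than the answerer's exact code: a lookup table that maps each subfield code to a reference to its variable, which avoids symbolic references and runs under `use strict`. The name `%target` is hypothetical.

```perl
use strict;
use warnings;

# First sample line from the question.
my $line = '000000001 ZZZ $$aTextinA$$bTextinB$$cTextinC';

# Start every variable as the empty string so absent subfields stay "".
my ($subfield_a, $subfield_b, $subfield_c) = ('', '', '');

# Hypothetical lookup table: subfield code => reference to its variable.
my %target = (a => \$subfield_a, b => \$subfield_b, c => \$subfield_c);

# /g in scalar context advances through the line one subfield at a time.
while ($line =~ /\$\$(.)([^\$\n]*)/g) {
    my ($code, $text) = ($1, $2);
    next unless exists $target{$code};   # skip codes with no variable
    ${ $target{$code} } = $text;
}

print "a=[$subfield_a] b=[$subfield_b] c=[$subfield_c]\n";
# prints: a=[TextinA] b=[TextinB] c=[TextinC]
```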
better still would be to use %fields instead of $subfield_a, $subfield_b, $subfield_c
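For the "looping round the file" part, a sketch along these lines might do, keeping one hash per line. The in-memory filehandle and the `@records` array are assumptions for the sake of a runnable example; in practice you would open your real file instead. The `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Two sample lines from the question, read through an in-memory filehandle
# so the sketch runs without an input file.
my $input = <<'END';
000000001 ZZZ $$aTextinA$$bTextinB$$cTextinC
000000001 ZZZ $$aTextinA$$cTextinC
END

open my $fh, '<', \$input or die "Cannot open in-memory input: $!";

my @records;
while (my $line = <$fh>) {
    chomp $line;
    # One hash per line: subfield code => subfield text.
    my %fields = $line =~ /\$\$(\w)([^\$\n]*)/g;
    push @records, { %fields };
}
close $fh;

# Missing subfields are simply absent from the hash; // '' supplies the default.
for my $rec (@records) {
    print join(', ', map { "$_=" . ($rec->{$_} // '') } qw(a b c)), "\n";
}
```

With `%fields` you test for a missing subfield with `exists $fields{b}` rather than comparing against an empty string, and any extra codes such as $$d are picked up without further changes.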