credog
asked on
Perl Remove Portion of String
Given the portion of the perl script below, if the string is found in $variable the output is this:
{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==
If not found $variable will contain a single field something like this:
{MD5}lL5Bo6QZUIcsEuxuABCDEFG1YAgCAZF==
What I'd like to do if the string is found is to remove it so $variable only contains:
{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF==
Looking for suggestions on the best way to handle this since it needs to be pretty robust and a wide range of characters could be in there.
Currently, the second field of $variable ({MD5}BY2I75mxmgp1DPwsTYKiargac7==) is a constant value, so using that as a match will work. So for now I'm just looking to remove it if it's there.
In the future it may not be constant and in that case I would need to check if $variable contains two values/fields and remove the second one. The first one is all that is needed.
elsif ($variable =~ /\{MD5\}BY2I75mxmgp1DPwsTYKiargac7==/i) {
    print "Variable is $variable\n";
}
SOLUTION
Will the fields always end with == or always start with {MD5}?
Here's code that will work for either (with comments).
#!/usr/local/bin/perl
# always use strict and warnings
use strict;
use warnings;
my $variable = '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==';
# you probably want to comment out all of these except one
# if always ending in == (look behind for both = signs so neither is removed)
$variable =~ s{(?<===).*$}{};
# if always starts with {MD5} - make sure it's not the first one
# I couldn't figure out how to get a look-behind to match any char
$variable =~ s{(.){MD5}.*$}{$1};
# if there is always a space between items
$variable =~ s{\s.*$}{};
print $variable, "\n";
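For the case mentioned in the question, where the second field may stop being a constant, one alternative is to capture the first field rather than delete the second. The following is only a sketch under assumptions: the helper name first_md5_field is hypothetical, and it assumes a field never contains whitespace or a '{' after its {MD5} prefix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return only the first
# {MD5} field of a string that may hold one or two fields.
sub first_md5_field {
    my ($value) = @_;
    # Literal {MD5} prefix, then every character that is neither
    # whitespace nor the '{' that would start a second field.
    my ($first) = $value =~ /(\{MD5\}[^\s{]+)/;
    return $first;    # undef when no {MD5} field is present
}

my $two_fields = '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==';
print first_md5_field($two_fields), "\n";
```

Because the pattern is not anchored, a leading space in front of the first field does not affect the result.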
ASKER
All options seem to work equally well, but I'm focusing on
$variable =~ s{(.){MD5}.*$}{$1};
since I think it will always start with {MD5}. I'm confused about how this works and would appreciate a detailed explanation. Here's my take, although most likely inaccurate:
1. The brackets (except those around MD5) are just the delimiters you chose to use instead of the standard /. Not sure why the brackets around MD5 do not need to be escaped, but it appears they do not.
2. The
(.){MD5}.*${$1}
section has me confused. I'm just not sure why this is working. In the brackets that have $1, I can put {$1} or {} and still get the desired result. I don't understand how this is removing the second field. Also, I can remove the $, which I assume is an end-of-line anchor, and still achieve the desired results. I just want to make sure that it will always remove the second field.
This regex works even if the space is removed between the two fields, which is great, but I don't see how. The explanation that I have found of what (.) means doesn't seem to match how this is working. Thanks
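A small experiment may make the behaviour of (.) easier to see. The sketch below is an illustration, not the expert's code: the helper strip_second and its $keep_captured flag are hypothetical, and the braces are escaped and / is used as the delimiter because some newer Perl releases warn about or reject unescaped literal left braces. The (.) must match one character immediately before a {MD5}, so the pattern can never match at the first {MD5} (nothing precedes it) and instead matches at the second one. Whatever (.) captured, a space or the final = of the first field, is put back by $1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove the second field, optionally putting back
# the single character that (.) captured in front of it.
sub strip_second {
    my ($value, $keep_captured) = @_;
    if ($keep_captured) {
        $value =~ s/(.)\{MD5\}.*$/$1/;    # same idea as the {$1} replacement
    }
    else {
        $value =~ s/(.)\{MD5\}.*$//;      # same idea as the empty {} replacement
    }
    return $value;
}

my $spaced = '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==';
my $joined = '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF=={MD5}BY2I75mxmgp1DPwsTYKiargac7==';

# Brackets make a trailing space or a missing '=' visible.
print "[", strip_second($spaced, 1), "]\n";
print "[", strip_second($spaced, 0), "]\n";
print "[", strip_second($joined, 1), "]\n";
print "[", strip_second($joined, 0), "]\n";
```

With a space between the fields, the two replacements differ only by a trailing space, which is easy to miss when printing. Without a space, the empty replacement also drops the last = of the first field, so keeping $1 looks like the safer choice.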
ASKER CERTIFIED SOLUTION
ASKER
OK. Great explanation. Now that I understand what this does I have added the following to make sure this behaves the same way if by chance $variable has a space in the front for some reason.
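sub trim {
    $_[0] =~ s/^\s+//;    # remove leading whitespace
    $_[0] =~ s/\s+$//;    # remove trailing whitespace
    $_[0] =~ s/\s+//g;    # remove any remaining whitespace, including between fields
    return;
}
.....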
elsif ($variable =~ /\{MD5\}BY2I75mxmgp1DPwsTYKiargac7==/i) {
    print "Variable is $variable\n";
    trim ($variable);
    $variable =~ s{(.){MD5}.*$}{$1};
    print "Variable is $variable\n";
}
This seems to work pretty well. It removes any possible leading space so I can be assured the regular expression is not grabbing the wrong field. Also it gets rid of any space in the middle which I do not need and I'm not sure if there could be none, 1 or several spaces. I don't think I'm missing anything?
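As a possible variation on the approach above, the cleanup and the substitution can be combined in one function that works on a copy instead of changing its argument in place through $_[0]. This is a sketch only: the name clean_first_field is hypothetical, and the braces are escaped as a precaution for newer Perl releases. Since s/\s+//g already removes every whitespace character, the separate leading and trailing substitutions are not needed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a cleaned copy holding only the first field.
sub clean_first_field {
    my ($value) = @_;
    # Copy first, then strip all whitespace (leading, trailing, in between).
    (my $compact = $value) =~ s/\s+//g;
    # The fields are now adjacent, so (.) captures the last '=' of the
    # first field and $1 puts it back.
    $compact =~ s/(.)\{MD5\}.*$/$1/;
    return $compact;
}

my $variable = '  {MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF==   {MD5}BY2I75mxmgp1DPwsTYKiargac7== ';
print "Variable is ", clean_first_field($variable), "\n";
```

The caller's $variable is left untouched, which can make it easier to log the original and cleaned values side by side.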
$_[0]=~s/^\s+//; # removes leading spaces
SOLUTION