credog

Perl Remove Portion of String

Given the portion of the Perl script below, if the string is found in $variable, the output is this:
{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==

If it is not found, $variable will contain a single field, something like this:
{MD5}lL5Bo6QZUIcsEuxuABCDEFG1YAgCAZF==

What I'd like to do if the string is found is to remove it so $variable only contains:
{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF==

I'm looking for suggestions on the best way to handle this, since it needs to be pretty robust and a wide range of characters could be in there.

Currently, the second field of $variable ({MD5}BY2I75mxmgp1DPwsTYKiargac7==) is a constant value, so using that as a match will work. So for now I'm just looking to remove it if it's there.

In the future it may not be constant and in that case I would need to check if $variable contains two values/fields and remove the second one.  The first one is all that is needed.

elsif ($variable =~ /\{MD5\}BY2I75mxmgp1DPwsTYKiargac7==/i) {
      print "Variable is $variable\n";
}

Terry Woods

This appears to work for me:


#!/usr/bin/perl

use strict;
use warnings;

my $variable = "{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==";
print "Variable before: $variable\n";
$variable =~ s/\s\{MD5\}.*//;
print "Variable after: $variable\n";

OUTPUT:
Variable before: {MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==
Variable after: {MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF==
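
For the future case in the question, where the second field may no longer be constant, one option is to keep only the first whitespace-separated field instead of matching on {MD5}. This is a minimal sketch, assuming the fields never contain whitespace themselves; first_field is a hypothetical helper name, not something from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first whitespace-separated field.
sub first_field {
    my ($value) = @_;
    # split ' ' ignores leading whitespace and splits on runs of whitespace,
    # so one space, several spaces or tabs between fields all behave the same.
    my ($first) = split ' ', $value;
    # An empty or all-whitespace string yields no fields at all.
    return defined $first ? $first : '';
}

print first_field('{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7=='), "\n";
```

This only helps when at least one whitespace character separates the fields; if they can run together, matching on the {MD5} prefix is still needed.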

Will the fields always end with == or always start with {MD5}?

Here's code that will work for either (with comments).
#!/usr/local/bin/perl

# always use strict and warnings
use strict;
use warnings;

my $variable = '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==';

# you probably want to comment out all of these except one

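# if always ending in == (the look-behind needs both = signs,
# otherwise the match starts between them and the second = is removed too)
$variable =~ s{(?<===).*$}{};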

# if always starts with {MD5} - make sure it's not the first one
# I couldn't figure out how to get a look-behind to match any char
$variable =~ s{(.){MD5}.*$}{$1};

# if there is always a space between items
$variable =~ s{\s.*$}{};

print $variable, "\n";
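
On the look-behind mentioned in the comments above: a look-behind can assert that a character precedes the match without consuming it, so nothing has to be captured and put back. The following is a sketch under the assumption that every field starts with {MD5}; keep_first_md5 is a hypothetical name, and \S is used instead of . so that a leading space in front of the first field does not count as the preceding character:

```perl
use strict;
use warnings;

# Hypothetical helper: removes every {MD5} field except the first one.
sub keep_first_md5 {
    my ($value) = @_;
    # (?<=\S)     a non-whitespace character must sit just before the match,
    #             so the {MD5} that opens the first field never qualifies
    # \s*         optional whitespace between the fields
    # \{MD5\}.*$  the second field and everything after it on the line
    $value =~ s/(?<=\S)\s*\{MD5\}.*$//;
    return $value;
}

print keep_first_md5('{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7=='), "\n";
```

The braces are escaped here because the pattern uses / delimiters; that form is also the safer assumption on newer Perl releases, which have tightened the rules on unescaped { in patterns.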

credog

ASKER

All options seem to work equally well, but I'm focusing on
$variable =~ s{(.){MD5}.*$}{$1};


since I think it will always start with {MD5}. I'm confused about how this works and would appreciate a detailed explanation. Here's my take, although it is most likely inaccurate:

1. The braces (except those around MD5) are just the delimiters you chose to use instead of the standard /. I'm not sure why the braces around MD5 do not need to be escaped, but it appears they do not.

2. The
(.){MD5}.*$}{$1}


section has me confused.  I'm just not sure why this is working.  In the brackets that have $1, I can put {$1} or {} and still get the desired result.  I don't understand how this is removing the second field.

Also, I can remove the $, which I assume is an end-of-line anchor, and still achieve the desired results. I just want to make sure that it will always remove the second field.

This regex works even if the space between the two fields is removed, which is great, but I don't see how. The explanations I have found of what (.) means don't seem to match how this is working. Thanks
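
One way to see what the pattern is doing is to print what (.) captures for each kind of input. The sketch below is only an illustration: strip_second_field is a hypothetical helper, and the substitution is written with / delimiters and escaped braces, which should match the same text as the s{...}{...} form above. (.) matches exactly one character, so the pattern as a whole needs some character directly in front of {MD5}. The {MD5} at the very start of the string has nothing in front of it, so the first place the pattern can match is at the second {MD5}. .*$ then takes the rest of the line, and $1 puts the captured character back.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution and also returns
# the single character that the (.) group captured.
sub strip_second_field {
    my ($value) = @_;
    my $captured;                          # stays undef when nothing matches
    if ($value =~ s/(.)\{MD5\}.*$/$1/) {
        $captured = $1;                    # the character just before the second {MD5}
    }
    return ($value, $captured);
}

for my $input (
    '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF== {MD5}BY2I75mxmgp1DPwsTYKiargac7==',
    '{MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF=={MD5}BY2I75mxmgp1DPwsTYKiargac7==',
    '{MD5}lL5Bo6QZUIcsEuxuABCDEFG1YAgCAZF==',
) {
    my ($result, $captured) = strip_second_field($input);
    printf "result=[%s] captured=[%s]\n",
        $result, defined $captured ? $captured : 'no match';
}
```

With a space between the fields, the captured character is that space, so replacing with {} instead of {$1} only changes whether a trailing space is left behind. With no space, the captured character is the final = of the first field, and an empty replacement would drop it. A leading space in front of the first field would be captured instead, and the first field would then be removed as well.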

credog

ASKER

OK. Great explanation. Now that I understand what this does, I have added the following to make sure it behaves the same way if $variable happens to have a leading space for some reason.
sub trim {
    $_[0] =~ s/^\s+//;    # remove leading whitespace
    $_[0] =~ s/\s+$//;    # remove trailing whitespace
    $_[0] =~ s/\s+//g;    # remove any whitespace that is left
    return;
# .....
}
elsif ($variable =~ /\{MD5\}BY2I75mxmgp1DPwsTYKiargac7==/i) {
      print "Variable is $variable\n";
      trim($variable);
      $variable =~ s{(.){MD5}.*$}{$1};
      print "Variable is $variable\n";
}


This seems to work pretty well.  It removes any possible leading space so I can be assured the regular expression is not grabbing the wrong field. Also it gets rid of any space in the middle which I do not need and I'm not sure if there could be none, 1 or several spaces.  I don't think I'm missing anything?
$_[0]=~s/^\s+//; # removes leading spaces
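
As a side note on the trim sub: once s/\s+//g has run, no whitespace is left anywhere in the string, so the separate leading and trailing passes should not change the result. A minimal sketch of the combined cleanup, assuming the fields themselves never contain whitespace; first_md5_field is a hypothetical name, and unlike trim it returns a cleaned copy instead of modifying its argument in place:

```perl
use strict;
use warnings;

# Hypothetical helper: strips all whitespace, then keeps only the first {MD5} field.
sub first_md5_field {
    my ($value) = @_;
    # Deleting every run of whitespace also covers leading and trailing
    # whitespace, including a trailing newline.
    $value =~ s/\s+//g;
    # With no whitespace left, (.) captures the last character of the first
    # field and $1 puts it back after the rest has been removed.
    $value =~ s/(.)\{MD5\}.*$/$1/;
    return $value;
}

print first_md5_field("  {MD5}lL5Bo6QZUIcsEuxusMBXMR1YAgCAZF==   {MD5}BY2I75mxmgp1DPwsTYKiargac7== \n"), "\n";
```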