  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 307

String Matching

OK, I want to parse a URL. I want to take out an optional "http" and I'm not quite sure how to do it. Is there any way to group the "http" so that you can add a "?" after it (making it optional)?

Also, I'm having a problem using ".*" with matching. Say I have a string which I get from a text datafile "$name=Me&password=mypass&address=myaddress&status=active&hobby=computer design" and I want to change just the address field, but I don't know if the fields are always going to be in the same order. So I try:

$string =~ s/&address=.*&/&address=newaddress&/i;

But the .* will make it continue to the last "&". Any suggestions?

Thanks,
OKSD
1 Solution
 
rj2Commented:
#!/usr/bin/perl
$url='http://www.myhost.com/cgi-bin/test.pl';
$url =~ m!(?:http://)?([^/]*)!;
print "Host is $1\n";

$string='name=Me&password=mypass&address=myaddress&status=active&hobby=computer design';
$string =~ s/address=[^&]*/address=newaddress/;
print "$string\n";
 
bcladdCommented:
(1) Yes, you can make http optional with a question mark. Just group the part you want in parentheses, or in non-capturing parentheses, (?:...), if you do not need the captured text or are worried about speed. So the following matches the Experts Exchange URL with or without the leading scheme (note that the dots are escaped, since a bare . matches any character):

    if ($url =~ m{(http://)?www\.experts-exchange\.com}) {
      print "We have a match!\n";
    } else {
      print "No match!\n";
    }

(2) Use non-greedy quantification. *? is a non-greedy star (match zero or more but as few as possible). So your expression can be rewritten as
    $name =~ s/&address=.*?&/&address=$newaddress&/i;

Note that there is a problem with your code. You can replace the address so long as it is not the last field in the list of fields (you require the terminating &). It would make more sense to use

    $name =~ s/&address=[^&]*/&address=$newaddress/i;

(Because [^&] can never match an ampersand, the match stops at the next & on its own, so it is safe to use the greedy quantifier again.)
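
A minimal sketch comparing the three substitutions discussed above on the sample record from the question. The variable names and the replacement value are assumptions made for illustration; only the patterns come from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample record from the question; the replacement value is made up.
my $record     = 'name=Me&password=mypass&address=myaddress&status=active&hobby=computer design';
my $newaddress = 'newaddress';

# Greedy: .* runs to the LAST ampersand, swallowing the status field.
(my $greedy = $record) =~ s/&address=.*&/&address=$newaddress&/i;

# Non-greedy: .*? stops at the first ampersand after the value.
(my $lazy = $record) =~ s/&address=.*?&/&address=$newaddress&/i;

# Negated class: [^&]* cannot cross an ampersand, so no trailing & is needed.
(my $negated = $record) =~ s/&address=[^&]*/&address=$newaddress/i;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";

# When address is the last field, the pattern that requires a trailing &
# no longer matches, while the negated-class form still does.
my $tail = 'name=Me&address=myaddress';
(my $tail_lazy    = $tail) =~ s/&address=.*?&/&address=$newaddress&/i;
(my $tail_negated = $tail) =~ s/&address=[^&]*/&address=$newaddress/i;
print "last field, lazy:    $tail_lazy\n";
print "last field, negated: $tail_negated\n";
```

The greedy version drops the status field, the other two leave it intact, and only the negated-class version changes the record when address comes last.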

Hope this helps, -bcl
 
OKSDAuthor Commented:
Well, it looks as if I've got two working answers. I think I'm going to have to give the points to bcladd because of the more descriptive answer. Thanks for the help though, rj2.

And bcladd, I am aware that there will need to be an ending ampersand (&). When I put the full thing together, I always put some nonsense like null=null at both ends of the entry in the datafile for various other reasons (like line breaks). But could you explain what [^&]* does? I thought that "^" limited a match to the beginning of a string....

Thanks,
OKSD
 
bcladdCommented:
^ has two different meanings in Perl regular expressions:

(1) As you say, it anchors a match to the beginning of the string. With the /m modifier it also matches at the beginning of each line, that is, just after any embedded newline.

(2) When ^ appears as the FIRST character in a character class (between []), it means NOT. Thus [^&] is any character that is not &, just as [^aeiou] is any character but a lowercase vowel (note: any character, not just any LETTER). Adding the star matches 0 or more of the preceding regular expression component, so the character class of anything but an & is matched 0 or more times.
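
A short sketch of both meanings, for illustration only. The sample strings and variable names are made up and are not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Outside a character class, ^ anchors to the start of the string.
my $at_start  = ($text =~ /^first/)  ? 1 : 0;
my $not_start = ($text =~ /^second/) ? 1 : 0;   # "second" is not at the start

# With /m, ^ also matches just after each embedded newline.
my $with_m = ($text =~ /^second/m) ? 1 : 0;

# As the first character inside [], ^ negates the class:
# [^&]* collects everything up to (not including) the next ampersand.
my ($value) = 'address=myaddress&status=active' =~ /address=([^&]*)/;

# Anywhere else inside [], ^ is just a literal caret.
my $literal = ('a^c' =~ /[b^]/) ? 1 : 0;

print "anchored at start:  $at_start\n";
print "without /m:         $not_start\n";
print "with /m:            $with_m\n";
print "captured value:     $value\n";
print "literal caret:      $literal\n";
```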

-bcl
 
OKSDAuthor Commented:
All right, I've got one more problem. I'm parsing the URL of a variable address that I get from $ENV{'HTTP_REFERER'} and I want to separate out the domain name and the page. So I type:

#!/usr/bin/perl
$ENV{'HTTP_REFERER'} = "http://somedomain.com/page.html";
#I just set that for testing purposes
$url = $ENV{'HTTP_REFERER'};
$url =~ m{(http://)?([^/])(.*)}i;
$domain = $2;
$page = $3;
print "Content-type: text/html\n\n";
print "Domain is: $domain<p>\nPage is: $page";

and I get:

Domain is: s
Page is: omedomain.com/page.html

Could you help me with that please?

Thanks!
OKSD
 
OKSDAuthor Commented:
Never mind, I got it. I forgot to add the "*" after the brackets, like so:

$url =~ m{(http://)?([^/]*)(.*)}i;
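
A hedged variant of the same split, assuming the scheme itself is not needed afterwards. The helper name split_referer is hypothetical, and like the pattern above it only recognizes a literal "http://" prefix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_referer is a hypothetical helper, not something from the thread.
sub split_referer {
    my ($url) = @_;
    # ^ anchors at the start of the string; (?:...)? makes the scheme
    # optional without using up a capture group, so the host lands in $1
    # and the rest of the URL in $2.
    if ($url =~ m{^(?:http://)?([^/]*)(.*)}i) {
        return ($1, $2);
    }
    return;
}

for my $url ('http://somedomain.com/page.html', 'somedomain.com/page.html') {
    my ($domain, $page) = split_referer($url);
    print "Domain is: $domain, page is: $page\n";
}
```

Using the non-capturing group keeps the capture numbering the same whether or not the scheme is present, which avoids having to remember that the domain is in $2.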

OKSD