String Matching

Ok, I want to parse a URL. I want to strip out an optional "http", and I'm not quite sure how to do it. Is there any way to group the "http" so that you can add a "?" after it (making it optional)?

Also, I'm having a problem using ".*" when matching. Say I have a string which I get from a text data file, "$name=Me&password=mypass&address=myaddress&status=active&hobby=computer design", and I want to change just the address field, but I don't know if the fields are always going to be in the same order. So I try:

$string =~ s/&address=.*&/&$address=newaddress&/i;

But the .* will make it continue to the last "&". Any suggestions?

Thanks,
OKSD

rj2

#!/usr/bin/perl
$url='http://www.myhost.com/cgi-bin/test.pl';
$url =~ m!(?:http://)?([^/]*)!;
print "Host is $1\n";

$string='name=Me&password=mypass&address=myaddress&status=active&hobby=computer design';
$string =~ s/address=[^&]*/address=newaddress/;
print "$string\n";
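
To illustrate why the original attempt overshoots, here is a small sketch that runs three substitutions on the same string: the greedy `.*`, the non-greedy `.*?`, and the negated class `[^&]*`. The variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = 'name=Me&password=mypass&address=myaddress&status=active&hobby=computer design';

# Greedy: .* runs to the LAST "&", so the status field is swallowed too.
(my $greedy = $data) =~ s/&address=.*&/&address=newaddress&/;

# Non-greedy: .*? stops at the first "&" that lets the match succeed.
(my $lazy = $data) =~ s/&address=.*?&/&address=newaddress&/;

# Negated class: [^&]* can never cross an "&" and needs no trailing "&".
(my $class = $data) =~ s/address=[^&]*/address=newaddress/;

print "$greedy\n$lazy\n$class\n";
```

Only the greedy version loses the status field. The negated class also keeps working when address happens to be the last field in the string, where the two patterns that require a trailing "&" would not match at all.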

ASKER CERTIFIED SOLUTION

bcladd

This solution is only available to members.

OKSD

ASKER

Well, it looks as if I've got two working answers. I think I'm going to have to give the points to bcladd because of the more descriptive answer. Thanks for the help though, rj2.

And bcladd, I am aware that there will need to be an ending ampersand (&). When I put the full thing together, I always put some nonsense like null=null at both ends of the entry in the data file for various other reasons (like line breaks). But could you explain what [^&]* does? I thought that "^" limited a match to the beginning of a string....

Thanks,
OKSD

bcladd

^ has two different meanings in Perl R.E.:

(1) As you say, it is used to anchor a match to the beginning of the string. With the /m modifier it also matches at the beginning of each line within the string, that is, just after every embedded newline. (The /s modifier does not affect ^; it only lets . match a newline.)

(2) When ^ appears as the FIRST character in a character class (between []), it means NOT. Thus [^&] is any character that is not & just as [^aeiou] is any character but a vowel (note: not any LETTER). Adding the star matches 0 or more of the preceding regular expression component so the character class of anything but an & is matched 0 or more times.

-bcl
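
A short sketch of the two meanings, for anyone who wants to try them side by side. The sample strings and variable names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Outside a class, ^ is an anchor: by default it matches only at the
# very start of the string.
my @default = $text =~ /^(\w+)/g;

# With /m, ^ also matches just after each embedded newline.
my @multiline = $text =~ /^(\w+)/mg;

# As the first character inside [], ^ negates the class.
my ($value) = 'address=myaddress&status=active' =~ /address=([^&]*)/;

# [^aeiou] excludes only those five lowercase letters, so digits,
# spaces and punctuation all match it and get deleted here.
(my $vowels_only = 'a1 e! io') =~ s/[^aeiou]//g;

print "@default | @multiline | $value | $vowels_only\n";
# prints: first | first second | myaddress | aeio
```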

OKSD

ASKER

All right, I've got one more problem. I'm parsing the URL of a variable address that I get from $ENV{'HTTP_REFERER'}, and I want to separate out the domain name and the page. So I type:

#!/usr/bin/perl
$ENV{'HTTP_REFERER'} = "http://somedomain.com/page.html";
#I just set that for testing purposes
$url = $ENV{'HTTP_REFERER'};
$url =~ m{(http://)?([^/])(.*)}i;
$domain = $2;
$page = $3;
print "Content-type: text/html\n\n";
print "Domin is: $domain<p>\nPage is: $page";

and I get:

Domin is: s
Page is: omedomain.com/page.html

Could you help me with that, please?

Thanks!
OKSD

OKSD

ASKER

Never mind, I got it. I forgot to add the "*" after the brackets, so [^/] matched only a single character. It should be:

$url =~ m{(http://)?([^/]*)(.*)}i;

OKSD
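
For reference, a minimal sketch of the corrected pattern wrapped in a small subroutine. The name split_url is hypothetical, and the sketch assumes a plain "http://" prefix or no scheme at all; it uses a non-capturing group so the host and page land in $1 and $2 rather than $2 and $3.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_url is a hypothetical helper name, not part of any module.
sub split_url {
    my ($url) = @_;
    # (?:http://)? makes the scheme optional without capturing it,
    # [^/]* takes the host up to the first "/", and (.*) takes the rest.
    # In list context the match returns the two captured groups.
    return $url =~ m{^(?:http://)?([^/]*)(.*)}i;
}

my ($domain, $page) = split_url('http://somedomain.com/page.html');
print "Domain is: $domain\nPage is: $page\n";

# The same pattern also copes with a URL that has no scheme.
my ($domain2, $page2) = split_url('somedomain.com/page.html');
print "Domain is: $domain2\nPage is: $page2\n";
```

Both calls give somedomain.com and /page.html (the page keeps its leading slash). An "https://" URL would not be handled by this pattern as written, because the optional group only knows about "http://".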