aish13
asked on
PCRE for URL / Hostname match
Hi -
I am looking for a PCRE expression which can match something like this -
http://[InputExpression] OR https://[InputExpression]
http://[InputExpression]:80 OR https://[InputExpression]:443
http://[InputExpression]/ OR https://[InputExpression]/
http://[InputExpression]:80/ OR https://[InputExpression]:443/
The [InputExpression] value could be either "myashish" or an IP address.
Basically, the PCRE should match only the above. If there is anything else in the URL, e.g. http://mypgetrain/helloworld or http://mypgetrain/helloworld/cust (a path at the end), then it should not match. It should match only if the input value is one of the above URLs.
Any help in this regard would be really appreciated.
Regards
Ashish
$pattern = "@https?://".preg_quote($inputExpression,"@").":(?:80|443)?/?@";
if (preg_match($pattern, $url)) {
print "Great";
} else {
print "Blurgh";
}
Seeing Adam314's suggestion, you'll want the start and end-of-line placemarkers too:
$pattern = "@^https?://".preg_quote($inputExpression,"@").":(?:80|443)?/?$@";
Apologies - there was a mistake in my suggestions. This should do it:
$pattern = "@^https?://".preg_quote($inputExpression,"@")."(?:\:(?:80|443))?/?$@";
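To see the corrected pattern in action, here is a minimal sketch that runs it against a few URLs. The host value "mycompanytrain" and the test URLs are only assumptions for illustration, and the unneeded backslash before the colon is dropped:

```php
<?php
// Hypothetical host value taken from the question; replace with your own.
$inputExpression = 'mycompanytrain';

// ^ and $ anchor the match; the port group and the trailing slash are optional.
$pattern = '@^https?://' . preg_quote($inputExpression, '@') . '(?::(?:80|443))?/?$@';

// Illustrative inputs: the first two should match, the last two should not.
$urls = [
    'http://mycompanytrain',
    'https://mycompanytrain:443/',
    'http://mycompanytrain/helloworld',
    'http://mycompanytrain:8080',
];

$results = [];
foreach ($urls as $url) {
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url, ' => ', $results[$url] ? 'match' : 'no match', "\n";
}
```

Note that, like the pattern above, this accepts either port with either scheme (e.g. http with :443). If that matters, the scheme and port would need to be paired explicitly.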
ASKER
Hi - Would it be possible for you to create a PCRE that has the input expression hardcoded as "mycompanytrain"? I am very new to PCRE and couldn't figure out how to do it.
Regards
Ashish
preg_quote escapes any regular special characters that are valid characters in the URL, such as full stops. You'll need to use the preg_quote version if you want full stops in your input expression, unless you manually escape them.
eg if your input expression was google.co.uk, you could use either:
$inputExpression = "google.co.uk"; #just this changed... very easy!
$pattern = "@^https?://".preg_quote($inputExpression,"@")."(?:\:(?:80|443))?/?$@";
or the following, manually escaping the full stops (not the best way):
$pattern = "@^https?://google\.co\.uk(?:\:(?:80|443))?/?$@";
Typo: "regular special characters" was supposed to say "regular expression special characters"
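To illustrate why the escaping matters, here is a small sketch comparing an unescaped host with the preg_quote version. The lookalike URL is made up for the demonstration:

```php
<?php
$host = 'google.co.uk';

// Without escaping, each "." in the host means "any single character".
$unescaped = '@^https?://' . $host . '/?$@';

// preg_quote turns each "." into "\." so it only matches a literal full stop.
$quoted  = preg_quote($host, '@');
$escaped = '@^https?://' . $quoted . '/?$@';

// Hypothetical lookalike URL with "X" where the full stops should be.
$lookalike = 'http://googleXcoXuk';

$unescapedHit = preg_match($unescaped, $lookalike) === 1;
$escapedHit   = preg_match($escaped, $lookalike) === 1;

echo $quoted, "\n";
echo 'unescaped: ', $unescapedHit ? 'match' : 'no match', "\n";
echo 'escaped: ', $escapedHit ? 'match' : 'no match', "\n";
```

The unescaped pattern accepts the lookalike, while the quoted one rejects it and still accepts the real host.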
ASKER
Hi - Thanks a lot for the solution. The URL pattern you gave below produced the following results -
"@^https?://mypgetrain(?:\:(?:80|443))?/?$@"
MATCHES
http://mycompanytrain
http://mycompanytrain/
http://mycompanytrain:80
http://mycompanytrain:80/
https://mycompanytrain
https://mycompanytrain/
https://mycompanytrain:443
https://mycompanytrain:443/
DID NOT MATCH
http://172.21.141.208
http://172.21.141.208/
http://172.21.141.208:80
http://172.21.141.208:80/
https://172.21.141.208
https://172.21.141.208/
https://172.21.141.208:443
https://172.21.141.208:443/
Can you please change the pattern so that it matches the IP address as well?
Regards
Ashish
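One possible way to accept the IP address as well is to alternate between the quoted host name and a dotted-quad IPv4 pattern. This is only a sketch under my own assumptions (the variable names and the octet sub-pattern are illustrative, and it is not necessarily the solution that was accepted in this thread):

```php
<?php
// Hypothetical host name from the question.
$hostName = 'mycompanytrain';

// One IPv4 octet, 0 to 255.
$octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])';

// Four octets separated by literal full stops.
$ipv4 = $octet . '(?:\.' . $octet . '){3}';

// Either the quoted host name or any IPv4 address, then the optional port and slash.
$pattern = '@^https?://(?:' . preg_quote($hostName, '@') . '|' . $ipv4 . ')(?::(?:80|443))?/?$@';

$urls = [
    'http://172.21.141.208',
    'https://172.21.141.208:443/',
    'http://mycompanytrain:80',
    'http://172.21.141.208/helloworld',
    'http://999.1.1.1',
];

$results = [];
foreach ($urls as $url) {
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url, ' => ', $results[$url] ? 'match' : 'no match', "\n";
}
```

If only one specific address should be accepted, passing that address through preg_quote in place of $ipv4 would be the simpler option.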
ASKER
Thanks a lot for help...
m#^https?://[\w\.]+(:80|:4
If not, and you have to use / as the delimiter, then the / needs to be escaped in the RE:
/^https?:\/\/[\w\.]+(:80|:
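The delimiter point carries over to PHP as well. The following sketch (host name assumed for illustration, and not a reconstruction of the truncated patterns above) builds the same expression with "@" and with "/" as the delimiter. With "/", every literal slash has to be escaped, and the second argument of preg_quote should name the delimiter so it is escaped in the input too:

```php
<?php
// Hypothetical host name from the question.
$host = 'mycompanytrain';

// "@" as delimiter: the slashes in "://" need no escaping.
$withAt = '@^https?://' . preg_quote($host, '@') . '/?$@';

// "/" as delimiter: each literal slash in the pattern is written as "\/".
$withSlash = '/^https?:\/\/' . preg_quote($host, '/') . '\/?$/';

$good = 'https://mycompanytrain/';
$bad  = 'https://mycompanytrain/x';

$atGood    = preg_match($withAt, $good);
$slashGood = preg_match($withSlash, $good);
$atBad     = preg_match($withAt, $bad);
$slashBad  = preg_match($withSlash, $bad);

echo "good: $atGood $slashGood, bad: $atBad $slashBad\n";
```

Both forms behave the same way; a delimiter that does not occur in URLs simply keeps the pattern easier to read.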