EffinGood
REGEX Help for Domain Name + Path
Hello,
I am sifting through a page of text and want to use preg_match on it to find urls that match a profile and capture them.
What is the regex I would use to find "http://domain.com/page/name" with:
"domain" could be "domain1" or "mydomain3"
"page/name" being any string with or without slashes, repeated any number of times
the "http" should also look for "https"
there might be a www or no www
Thank you very much!
Thank you!
ASKER CERTIFIED SOLUTION
Glad I could help!
But I'm afraid I don't feel the same way... :)
ASKER
That's ok. I understand. We can still be friends. :)
I've been asked on another forum to:
1. Stop using {0,1} and use the optional quantifier (?) instead, the reason being that it's easier to read for the "properly" trained regexp specialists.
2. Use non-capturing groups (?:...) where possible, to speed up matches a little (because the regex engine does not need to keep track of those groups).
So below is the functionally equivalent regex, written a little more "canonically":
'%https?://(?:www\.)?(?:domain1|mydomain3)\.com(?:/\w*(?:\.\w*)*)*%'
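In case it helps, here is a minimal sketch of how that pattern could be dropped into PHP. The sample text and the variable names ($pattern, $text, $matches, $urls) are made up for illustration, and I'm assuming you want every matching URL on the page, which is why it uses preg_match_all() rather than preg_match() (preg_match() stops after the first hit).

```php
<?php
// The pattern from the post above, unchanged. % is the delimiter, so the
// slashes in the URL don't need escaping.
$pattern = '%https?://(?:www\.)?(?:domain1|mydomain3)\.com(?:/\w*(?:\.\w*)*)*%';

// Made-up sample text (an assumption, not taken from the thread).
$text = 'See http://domain1.com/page/name and https://www.mydomain3.com/a/b/c.html, '
      . 'but not http://other.com/page or ftp://domain1.com/file.';

// preg_match_all() returns the number of full matches and fills $matches.
$count = preg_match_all($pattern, $text, $matches);

// The pattern has no capturing groups, so $matches[0] (the full matches)
// is the only entry and holds the URLs themselves.
$urls = $matches[0];

foreach ($urls as $url) {
    echo $url, PHP_EOL;
}
```

One thing to keep in mind: \w only covers letters, digits and underscore, so a path containing a hyphen or a query string would be cut short at that character, and a URL followed directly by a full stop would pick up the trailing dot. Whether that matters depends on what your pages look like.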
HTH, Dan
ASKER