futr_vision
asked on
How do I use regex to find 2 groups in a URL?
I am not a programmer by trade, so I did my best, but I need help. I need to create a rule for Google Tag Manager using regex. My goal is to look at a URL and find two separate group matches in the string. Here is a sample URL:
http://123.website.com/?&guid=blahblahblah&page=something&type=abc&adv=abc1234&site={siteID}
I originally had the expression below, which would work great if it weren't for the "&guid=blahblahblah&page=something&" in between the two groups. How do I check for those two groups in one expression? Here is what I originally had:
(http://)(([0-9])|([0-9][0-9])|([0-9][0-9][0-9])).website.com\?(type\=abc)
Bonus: How can I make it check for https as well as http?
Thx!
ASKER
Cool. I got this answer from another resource:
https?://\d{1,3}\.website\.com/.*type=abc.* (Perl, which I am not sure works with Google Tag Manager)
and this one
https?:\/\/([\d]{1,4})\.website\.com\/.*?&type=(.*?)&.*?
The second one works, but type needs to be an exact match. After testing, I also found that I need to revise my request. I need the expression to find an exact match for both type= and adv=. So in my example I need to make sure that "type=abc" and "adv=abc1234" match exactly.
OK. This:
https?:\/\/(\w+)\.website\.com\/.*type=abc&adv=abc1234
will give you "123" in $1, but only if the URL contains type=abc immediately followed by &adv=abc1234.
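For anyone who wants to check this outside Google Tag Manager, here is a minimal Python sketch that runs the expression above against the sample URL from the question. Python's re module is only a stand-in; Tag Manager's regex engine may differ in details, and the variable names are made up for illustration.

```python
import re

# Expression from the answer above; Python stands in for the real engine.
PATTERN = re.compile(r"https?:\/\/(\w+)\.website\.com\/.*type=abc&adv=abc1234")

# Sample URL from the question.
sample_url = ("http://123.website.com/?&guid=blahblahblah&page=something"
              "&type=abc&adv=abc1234&site={siteID}")

match = PATTERN.search(sample_url)
subdomain = match.group(1) if match else None
print(subdomain)  # 123

# A different type= value means there is no match at all.
other_url = sample_url.replace("type=abc", "type=xyz")
print(PATTERN.search(other_url))  # None
```

The leading https? is what lets the same expression accept both http and https.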
ASKER
Hmm. So maybe I am not being completely clear, or maybe I am misreading your response.
A number from 1-9999 needs to be in that first spot after the http(s)://
If a number is present and only a number then it has to match the type= and the adv= next.
Is that what your regex does?
My regexp will return any run of letters, digits or _ between "http://" and ".website.com". If you need to restrict it to a number between 1 and 9999, change \w+ to:
[1-9]\d{0,3}
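A quick way to see what that restriction accepts is a small Python sketch like the one below. It assumes Python's re module behaves like the target engine for this expression; the QUERY string and the subdomain_number helper are hypothetical names used only for the test.

```python
import re

# Same expression as before, with the subdomain limited to 1-9999.
PATTERN = re.compile(
    r"https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234"
)

QUERY = "/?&guid=x&type=abc&adv=abc1234"  # hypothetical query string

def subdomain_number(host_label):
    """Return the captured number, or None if the URL is rejected."""
    m = PATTERN.search("http://" + host_label + ".website.com" + QUERY)
    return m.group(1) if m else None

# One to four digits pass; five digits or any letters are rejected.
for label in ["1", "123", "9999", "12345", "ab12"]:
    print(label, subdomain_number(label))
```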
ASKER
Would this work?
https?:\/\/(\d{1,4})\.website\.com\/.*type=abc&adv=abc1234
Actually, I think yours will fail if the number starts with a "0". I don't foresee that happening, but maybe I should be loose in my definition in case it does.
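The leading-zero difference between the two variants can be checked with a short Python sketch. The URL is a made-up example, and Python's re module is assumed to be close enough to the target engine for these simple patterns.

```python
import re

TAIL = r"\.website\.com\/.*type=abc&adv=abc1234"
strict = re.compile(r"https?:\/\/([1-9]\d{0,3})" + TAIL)  # first digit must be 1-9
loose = re.compile(r"https?:\/\/(\d{1,4})" + TAIL)        # any one to four digits

url = "http://0123.website.com/?&type=abc&adv=abc1234"  # hypothetical URL

print(bool(strict.search(url)))  # False
print(bool(loose.search(url)))   # True
```

So \d{1,4} is the looser choice if landing pages might be numbered with a leading zero.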
ASKER CERTIFIED SOLUTION
ASKER
What I am saying is that the person creating the landing pages may start the number with a "0". I don't have much control over that. I haven't seen it happen, but who knows. I guess I could always alter it if I need to in those instances. Otherwise I think this will work. Now to make sure that Google Tag Manager accepts it :) It looks like pretty standard RegEx, so I don't see why not.
ASKER
Hmm. Just tested. I think the ampersand might be breaking it. This is how Google sees it:
{{url}} matches RegEx https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234
Then use an omnichar:
https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc\.adv=abc1234
ASKER
It doesn't like me escaping the ampersand either. I'll try your newest method. Thanks!
As I said, use an omnichar. The only risk is matching stuff like type=abc,adv=abc1234, which are illegal anyway.
ASKER
Hmm. That doesn't validate in any of the tools I've tested it with or in google. Looks like all you did was escape a period but then again I don't know much about RegEx so I am probably missing a nuance. I need to be fairly strict about things but it also needs to work :)
What a trial this has been. You'd think this would be easier.
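The escaped-versus-unescaped dot question raised here can be checked with a small Python sketch. It assumes that "omnichar" means the unescaped dot, which matches any single character, and that Python's re module treats the dot the same way the target engine does.

```python
import re

escaped = re.compile(r"type=abc\.adv=abc1234")  # \. matches only a literal "."
wildcard = re.compile(r"type=abc.adv=abc1234")  # .  matches any one character

query = "type=abc&adv=abc1234"

print(bool(escaped.search(query)))                     # False
print(bool(wildcard.search(query)))                    # True
print(bool(wildcard.search("type=abc,adv=abc1234")))   # True
```

Under that assumption, the expression as posted (with \.) cannot match the "&" in the URL, while the unescaped dot does, at the cost of also accepting other separators such as a comma.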
SOLUTION
ASKER
Hmmm. I don't think the expression fails if I remove https?:\/\/
ASKER
As a quick add-on to this: will this find pages that start with a combination of letters and numbers, such as in this URL?
http://ab12.website.com
https?:\/\/(\w+)
Quick answer: it will find ab12
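A minimal Python check of that, assuming Python's re module as a stand-in for the target engine:

```python
import re

# \w+ captures letters, digits and underscores, and stops at the first dot.
m = re.search(r"https?:\/\/(\w+)", "http://ab12.website.com")
first_label = m.group(1) if m else None
print(first_label)  # ab12
```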
ASKER
perfect
will give you "123" in $1 and "type=abc" in $2. Bonus: it allows both http and https :)
HTH,
Dan