How do I use regex too find 2 groups in a URL?

I am not a programmer by trade so I did my best but I need help. I have a need to create a rule for Google Tag Manager using regex. My goal is to look at a URL and find two separate group matches in the string. Here is a sample URL

http://123.website.com/?&guid=blahblahblah&page=something&type=abc&adv=abc1234&site={siteID}

I originally had this which worked great if it weren't for the "&guid=blahblahblah&page=something&" in between the two groups. How do I check for those two groups in one expression? Here is what I oginally had:

(http://)(([0-9])|([0-9][0-9])|([0-9][0-9][0-9])).website.com\?(type\=abc)

Bonus: How can I make it check for https as well as http?

Thx!
futr_visionAsked:
Who is Participating?

Improve company productivity with a Business Account.Sign Up

x
 
Dan CraciunConnect With a Mentor IT ConsultantCommented:
You need this to make sure the number does not start with 0:
https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234

Open in new window

0
 
Dan CraciunIT ConsultantCommented:
Assuming you want to capture "123" and "type=abc" from your sample link, this:
https?://(\w+)\.website\.com/.*(type=\w+)

Open in new window

will give you "123" in $1 and "type=abc" in $2.

Bonus: it allows both http and https :)

HTH,
Dan
0
 
futr_visionAuthor Commented:
Cool. I got this answer in another resource.

https?://\d{1,3}\.website\.com/.*type=abc.* (pearl which I am not sure works with Google Tag Manager)

and this one

https?:\/\/([\d]{1,4})\.website\.com\/.*?&type=(.*?)&.*?

The second one works but type needs to be an exact match. I also, after testing found that I need to revise my request. I need the statement to find an exact match for both type= and adv= . So in my example I need to make sure that "type=abc" and "adv=abc1234" match exactly.
0
Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
Dan CraciunIT ConsultantCommented:
OK. This:
https?:\/\/(\w+)\.website\.com\/.*type=abc&adv=abc1234

Open in new window

will give you "123" in $1, only if type=abc and adv=abc1234
0
 
futr_visionAuthor Commented:
Hmm. So maybe i am not being completely clear or maybe I am misreading your response.

A number from 1-9999 needs to be in that first spot after the http(s)://
If a number is present and only a number then it has to match the type= and the adv= next.

Is that what your regex does?
0
 
Dan CraciunIT ConsultantCommented:
My regexp will return any letter, digit or _ between "http://" and ".website.com". If you need to restrict it to a number between 1 and 1999, change \w+ to
[1-9]\d{0,3}
0
 
futr_visionAuthor Commented:
Would this work?

https?:\/\/(\d{1,4})\.website\.com\/.*type=abc&adv=abc1234

Actually, I think it will fail if it starts with a "0". I don't forsee that happening but maybe I should be loose in my definition in case that does happen.
0
 
futr_visionAuthor Commented:
what I am saying is the person creating the landing pages may start the number with a "0". I don't have much control over that. i haven't seen it happen but who knows. I guess I could always alter it if I need too in those instances. Otherwise I think this will work. Now to make sure that Google Tag Manager accepts it :) Looks like pretty standard RegEx so I don't see why not.
0
 
futr_visionAuthor Commented:
Hmm. Just test4ed. I think the ampersand might be breaking it. This is how Google see it

{{url}} matches RegEx https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234
0
 
Dan CraciunIT ConsultantCommented:
Then use an omnichar:
https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc\.adv=abc1234
0
 
futr_visionAuthor Commented:
doesn't like me escaping the ampersand either. I'll try you newest method. Thanks!
0
 
Dan CraciunIT ConsultantCommented:
As I said, use an omnichar. The only risk is matching stuff like type=abc,adv=abc1234, which are illegal anyway.
0
 
futr_visionAuthor Commented:
Hmm. That doesn't validate in any of the tools I've tested it with or in google. Looks like all you did was escape a period but then again I don't know much about RegEx so I am probably missing a nuance. I need to be fairly strict about things but it also needs to work :)
What a trial this has been. You'd think this would be easier.
0
 
Dan CraciunConnect With a Mentor IT ConsultantCommented:
Yup, you're right :) The omnichar is a dot. No escape needed.

https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc.adv=abc1234
0
 
futr_visionAuthor Commented:
Cool. Looks like that works and I found out that the other one works as well. The one with the ampersand. One quick question. if I do not care whether or not the URL starts with an http:// or https:// do i just leave the https?:\/\/ off? I'm guessing that is not exactly the solution is it?
0
 
futr_visionAuthor Commented:
Hmmm. I don't think the expression fails if I remove https?:\/\/
0
 
futr_visionAuthor Commented:
As a quick add on to this will this finad page that start with a combination of letters and numbers such as in this URL?

http://ab12.website.com

https?:\/\/(\w+)
0
 
Dan CraciunIT ConsultantCommented:
Quick answer: it will find ab12
0
 
futr_visionAuthor Commented:
perfect
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.