• Status: Solved

How do I use regex to find 2 groups in a URL?

I am not a programmer by trade, so I did my best, but I need help. I need to create a rule for Google Tag Manager using regex. My goal is to look at a URL and find two separate group matches in the string. Here is a sample URL:

http://123.website.com/?&guid=blahblahblah&page=something&type=abc&adv=abc1234&site={siteID}

I originally had this, which would work great if it weren't for the "&guid=blahblahblah&page=something&" in between the two groups. How do I check for those two groups in one expression? Here is what I originally had:

(http://)(([0-9])|([0-9][0-9])|([0-9][0-9][0-9])).website.com\?(type\=abc)

Bonus: How can I make it check for https as well as http?

Thx!

Asked by futr_vision

Dan Craciun (IT Consultant) commented:
Assuming you want to capture "123" and "type=abc" from your sample link, this:
https?://(\w+)\.website\.com/.*(type=\w+)

will give you "123" in $1 and "type=abc" in $2.

Bonus: it allows both http and https :)

HTH,
Dan
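
For anyone who wants to check the two capture groups outside Google Tag Manager, the pattern above can be tried in Python's `re` module. This is only a sketch: Python is used purely for illustration, the variable names are made up, and GTM's own regex engine may differ in details.

```python
import re

# Pattern from the answer above: group 1 is the subdomain, group 2 the type pair.
PATTERN = re.compile(r"https?://(\w+)\.website\.com/.*(type=\w+)")

# Sample URL from the question.
url = ("http://123.website.com/?&guid=blahblahblah&page=something"
       "&type=abc&adv=abc1234&site={siteID}")

match = PATTERN.search(url)
subdomain, type_pair = match.group(1), match.group(2)
print(subdomain, type_pair)

# The "s?" makes the "s" optional, so the https form of the URL matches too.
https_match = PATTERN.search(url.replace("http://", "https://"))
print(https_match is not None)
```

The `.*` between the two groups is what skips over the "&guid=...&page=...&" part that broke the original expression.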

futr_vision (Author) commented:
Cool. I got this answer in another resource.

https?://\d{1,3}\.website\.com/.*type=abc.* (Perl syntax, which I am not sure works with Google Tag Manager)

and this one

https?:\/\/([\d]{1,4})\.website\.com\/.*?&type=(.*?)&.*?

The second one works, but type needs to be an exact match. After testing, I also found that I need to revise my request: I need the expression to find an exact match for both type= and adv=. So in my example I need to make sure that "type=abc" and "adv=abc1234" match exactly.
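
To see why the second pattern is not an exact match on type, here is a small Python sketch (illustrative only; `type_value` is a hypothetical helper, not something from GTM). The group `(.*?)` captures whatever follows `&type=` up to the next `&`, so any type value is accepted.

```python
import re

# Second pattern quoted above; the lazy ".*?" stops at the first "&type=".
PATTERN = re.compile(r"https?:\/\/([\d]{1,4})\.website\.com\/.*?&type=(.*?)&.*?")

def type_value(url):
    """Return the captured type value, or None when the URL does not match."""
    m = PATTERN.search(url)
    return m.group(2) if m else None

base = "http://123.website.com/?&guid=blahblahblah&page=something"
abc_value = type_value(base + "&type=abc&adv=abc1234&site={siteID}")
xyz_value = type_value(base + "&type=xyz&adv=abc1234&site={siteID}")
print(abc_value, xyz_value)
```

Both URLs match; only the captured value differs, which is why the literal text `type=abc` has to appear in the pattern when an exact match is required.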

Dan Craciun (IT Consultant) commented:
OK. This:
https?:\/\/(\w+)\.website\.com\/.*type=abc&adv=abc1234

will give you "123" in $1, only if type=abc and adv=abc1234

futr_vision (Author) commented:
Hmm. So maybe I am not being completely clear, or maybe I am misreading your response.

A number from 1-9999 needs to be in that first spot after the http(s)://
If a number is present and only a number then it has to match the type= and the adv= next.

Is that what your regex does?

Dan Craciun (IT Consultant) commented:
My regexp will capture any run of letters, digits or _ between "http://" and ".website.com". If you need to restrict it to a number between 1 and 9999, change \w+ to
[1-9]\d{0,3}
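
A quick way to sanity-check that fragment is to run only the host part of the expression against a few subdomains. The sketch below uses Python for illustration; `host_ok` and the test URLs are made up for the example.

```python
import re

# Host part only: one non-zero digit followed by up to three more digits.
HOST = re.compile(r"https?:\/\/([1-9]\d{0,3})\.website\.com")

def host_ok(url):
    """True when the subdomain is a number from 1 to 9999 without a leading zero."""
    return HOST.search(url) is not None

results = {
    url: host_ok(url)
    for url in [
        "http://1.website.com",
        "http://9999.website.com",
        "http://0123.website.com",   # leading zero
        "http://12345.website.com",  # five digits
        "http://ab12.website.com",   # letters in the subdomain
    ]
}
print(results)
```

Only the first two URLs pass; the leading zero, the fifth digit and the letters each make the match fail.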

futr_vision (Author) commented:
Would this work?

https?:\/\/(\d{1,4})\.website\.com\/.*type=abc&adv=abc1234

Actually, I think it will fail if it starts with a "0". I don't foresee that happening, but maybe I should be loose in my definition in case that does happen.

Dan Craciun (IT Consultant) commented:
You need this to make sure the number does not start with 0:
https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234
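
One thing worth checking when "exact match" matters: the expression is not anchored at the end, and nothing forces "type" to start a parameter name. The Python sketch below (illustrative only) compares the pattern above with a hypothetical stricter variant that is not from this thread; `STRICT`, `hits` and the test URLs are assumptions made for the example.

```python
import re

# Pattern from the answer above.
ANSWER = re.compile(
    r"https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234")
# Hypothetical stricter variant: "type" must follow "?" or "&", and
# "adv=abc1234" must be followed by "&" or the end of the URL.
STRICT = re.compile(
    r"https?:\/\/([1-9]\d{0,3})\.website\.com\/.*[?&]type=abc&adv=abc1234(?:&|$)")

sample = ("http://123.website.com/?&guid=blahblahblah&page=something"
          "&type=abc&adv=abc1234&site={siteID}")
longer_adv = sample.replace("adv=abc1234", "adv=abc12345")
subtype = "http://123.website.com/?subtype=abc&adv=abc1234"

def hits(pattern, url):
    """True when the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None

# Each entry is (matches ANSWER, matches STRICT).
report = {
    "sample": (hits(ANSWER, sample), hits(STRICT, sample)),
    "longer_adv": (hits(ANSWER, longer_adv), hits(STRICT, longer_adv)),
    "subtype": (hits(ANSWER, subtype), hits(STRICT, subtype)),
}
print(report)
```

Both patterns accept the sample URL, but only the stricter one rejects `adv=abc12345` and `subtype=abc`. Whether that extra strictness is needed depends on which URLs can actually occur.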

futr_vision (Author) commented:
What I am saying is that the person creating the landing pages may start the number with a "0". I don't have much control over that. I haven't seen it happen, but who knows. I guess I could always alter it if I need to in those instances. Otherwise I think this will work. Now to make sure that Google Tag Manager accepts it :) It looks like pretty standard RegEx, so I don't see why not.

futr_vision (Author) commented:
Hmm. Just tested. I think the ampersand might be breaking it. This is how Google sees it:

{{url}} matches RegEx https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234

Dan Craciun (IT Consultant) commented:
Then use an omnichar:
https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc\.adv=abc1234

futr_vision (Author) commented:
It doesn't like me escaping the ampersand either. I'll try your newest method. Thanks!

Dan Craciun (IT Consultant) commented:
As I said, use an omnichar. The only risk is matching stuff like type=abc,adv=abc1234, which are illegal anyway.

futr_vision (Author) commented:
Hmm. That doesn't validate in any of the tools I've tested it with or in Google. It looks like all you did was escape a period, but then again I don't know much about RegEx, so I am probably missing a nuance. I need to be fairly strict about things, but it also needs to work :)
What a trial this has been. You'd think this would be easier.

Dan Craciun (IT Consultant) commented:
Yup, you're right :) The omnichar is a dot. No escape needed.

https?:\/\/([1-9]\d{0,3})\.website\.com\/.*type=abc.adv=abc1234
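
The difference between the escaped and the unescaped dot is easy to demonstrate. In the Python sketch below (illustration only; the variable names are made up), `\.` demands a literal period between the two parameters, while a bare `.` accepts any single character, including `&`.

```python
import re

PREFIX = r"https?:\/\/([1-9]\d{0,3})\.website\.com\/.*"
escaped_dot = re.compile(PREFIX + r"type=abc\.adv=abc1234")  # literal "." only
any_char = re.compile(PREFIX + r"type=abc.adv=abc1234")      # any one character

url = ("http://123.website.com/?&guid=blahblahblah&page=something"
       "&type=abc&adv=abc1234&site={siteID}")
comma_url = url.replace("abc&adv", "abc,adv")

escaped_hits = escaped_dot.search(url) is not None   # "&" is not a period
any_hits = any_char.search(url) is not None          # "." accepts the "&"
comma_hits = any_char.search(comma_url) is not None  # "." accepts "," as well
print(escaped_hits, any_hits, comma_hits)
```

The last check is the looseness mentioned earlier: with the bare dot, `type=abc,adv=abc1234` also matches.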

futr_vision (Author) commented:
Cool. Looks like that works, and I found out that the other one works as well (the one with the ampersand). One quick question: if I do not care whether the URL starts with http:// or https://, do I just leave the https?:\/\/ off? I'm guessing that is not exactly the solution, is it?

futr_vision (Author) commented:
Hmmm. I don't think the expression fails if I remove https?:\/\/
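
A caveat on dropping the prefix, assuming the rule does a partial (unanchored) match: without `https?:\/\/` in front, nothing ties the number to the start of the host name, so a subdomain such as "ab12" passes on the strength of its trailing digits. The Python sketch below is illustrative only; `OPTIONAL_SCHEME` is a hypothetical alternative that is not from this thread.

```python
import re

TAIL = r"([1-9]\d{0,3})\.website\.com\/.*type=abc&adv=abc1234"
WITH_SCHEME = re.compile(r"https?:\/\/" + TAIL)
NO_SCHEME = re.compile(TAIL)
# Hypothetical alternative: anchor at the start and make the scheme optional.
OPTIONAL_SCHEME = re.compile(r"^(?:https?:\/\/)?" + TAIL)

query = "/?type=abc&adv=abc1234"
mixed = "http://ab12.website.com" + query
bare = "123.website.com" + query

with_scheme_mixed = WITH_SCHEME.search(mixed)       # None: "a" is not a digit
no_scheme_mixed = NO_SCHEME.search(mixed)           # matches, capturing "12"
optional_mixed = OPTIONAL_SCHEME.search(mixed)      # None: anchored at the start
optional_bare = OPTIONAL_SCHEME.search(bare)        # matches, capturing "123"
optional_https = OPTIONAL_SCHEME.search("https://" + bare)
print(no_scheme_mixed.group(1), optional_bare.group(1))
```

So the expression does not fail without the prefix, but it becomes looser about what may come before the number.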

futr_vision (Author) commented:
As a quick add-on to this, will this find pages that start with a combination of letters and numbers, such as in this URL?

http://ab12.website.com

https?:\/\/(\w+)

Dan Craciun (IT Consultant) commented:
Quick answer: it will find ab12

futr_vision (Author) commented:
perfect