schott19

Easy regex for someone who knows how to do them!

I have the following input, pattern and replacement:

input = "/suo/west_midlands/birmingham.aspx?"
pattern = "/suo/(\w+)/(\w+).aspx?"
replacement = "$1|$2"

When I run the match I get:

"west_midlands|birmingham?"

which is correct, but when I change the underscore to a dash it does not work!

What I really want is a match for:

 "/suo/west-midlands/birmingham.aspx?"   --->    "west-midlands|birmingham?"

Can anyone spot what I'm missing and, if so, explain why the pattern I currently have does not work for - but does for _, and also what the correct pattern is?

Thanks in advance

Dave
ASKER CERTIFIED SOLUTION
Fernando Soto
Hi schott19;

The pattern you are using does not give the results you want because the \w meta-character has the following meaning:

Match any one of the following characters: a-z, A-Z, _, and 0-9.

The character - is therefore not a member of that set and does not match. This is why I replaced \w with [a-zA-Z_\-0-9]. Also note that the - is escaped in the pattern because, when it appears inside [ ], it is otherwise the range meta-character.
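For anyone who wants to try this out, here is a minimal sketch of the difference. It is written in Python rather than the language used in the question, so the replacement is spelled \1|\2 instead of $1|$2, and the helper name rewrite is made up for illustration. The fixed pattern is only my assumption of how the character class slots into the original pattern; the + after each [ ] means "one or more of these characters".

```python
import re

url = "/suo/west-midlands/birmingham.aspx?"

# Original pattern: \w covers letters, digits and underscore, but not "-".
original = r"/suo/(\w+)/(\w+).aspx?"

# Assumed fix: an explicit character class with the dash escaped,
# followed by + so that each group matches one or more characters.
fixed = r"/suo/([a-zA-Z_\-0-9]+)/([a-zA-Z_\-0-9]+).aspx?"


def rewrite(pattern, text):
    """Hypothetical helper: replace the matched part with 'group1|group2'."""
    return re.sub(pattern, r"\1|\2", text)


# The original pattern stops at the dash in "west-midlands" and finds nothing.
print(re.search(original, url))  # None

# The character class accepts the dash, so both groups are captured.
# The unescaped "?" in the pattern only makes the "x" optional, which is
# why the literal "?" from the input is still there in the result.
print(rewrite(fixed, url))  # west-midlands|birmingham?
```

Dropping the + from either [ ] makes that group match exactly one character, so the match fails again.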

Fernando
schott19

ASKER

Thanks for your help,

Works a treat now... I was almost there but I forgot the + after the [ ]
Not a problem, glad I was able to help. ;=)