TheJase
asked on
Regular Expressions: Can't find multiple backreferences
Why, in JavaScript, or probably any other language, does the regular expression below:
/^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/
only return these backreferences:
$1 = 123
$2 = 789
when supplied with this data?
123,456,789
I am trying to make it return:
$1 = 123
$2 = 456
$3 = 789
but take into account that the returned values can be more than just that. That sample data is JUST an example. This is being used to parse input data on a web form.
ASKER
I'm trying to parse a set of comma-separated values (numeric only). In the RegExp above, you'll notice that it allows for a total of 2 or more values. So, the given value to parse could be anywhere from "value1, value2" to "value1, value2..., value(n)"
What I want to do is parse EACH csv into backreferences. However, the only backreferences I get are the first value (I understand how), and only the LAST value (not what I expected).
Thanks,
TheJase
I'm tempted to ask why bother using back-references at all, why not simply use the split method?
e.g.
var re = /\s*,\s*/;
var str = "123,456,789";
var arr = str.split(re);
alert(arr[0] + " and " + arr[1] + " and " + arr[2]);
arr now contains all the numeric values, split on the commas (ignoring any surrounding whitespace).
webauk
ASKER
I totally understand what you're saying, but I'm building a form validation applet that will iterate through form fields, looking for attributes such as [validate="/^(\d){5}$/"], use that value to validate the field, and possibly parse the backreferences into variables if needed.
Thanks,
TheJase
Now that puts a whole different light on it :-)
"A form validation applet" - that sounds very handy. I could make use of that here (hint, hint).
I'll have another look at this, but I'm in a different office and won't have much time today. Nevertheless, it's a puzzle to be solved :-)
webauk
OK, I think I'm guilty of "not seeing the wood for the trees" :-)
In a regex, if you put things into round brackets, the patterns matched will go into "regular expression memory" or become a "backreference", available as $1, $2, $3...
However, if you use (?: ) then you are telling the regex engine NOT to store the contents as a backreference. In your regex:
/^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/
You have only two sets of round brackets without ?:, so only two matches will go into regex memory / backreferences.
I must confess to being puzzled by your use of (?:(?: ) ). What's the thinking there?
webauk
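To illustrate the distinction described above, here is a minimal sketch with a deliberately simpler pattern than the one in the question. The variable names are made up for the example.

```javascript
// Capturing vs. non-capturing groups (names here are illustrative only).
const capturing = /(\d+)-(\d+)/;
const nonCapturing = /(?:\d+)-(\d+)/;

const m1 = "12-34".match(capturing);
const m2 = "12-34".match(nonCapturing);

// Index 0 of the result is the whole match, so subtract 1 to count groups.
console.log(m1.length - 1); // 2 stored groups
console.log(m2.length - 1); // 1 stored group
console.log(m2[1]);         // "34", because the (?: ) group is not stored
```

The number of slots is fixed by the number of capturing brackets written in the pattern, not by how many times they end up matching.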
ASKER
It is very possible that I might not be using correct syntax...
But notice the {1,}, which should mean from 1 to unlimited times. So, I figured if I had [123,456,789] as data, the first (\d+?) would catch 123, and the second (\d+?) would catch 456 and then 789, since it has a quantifier outside the group.
Thanks for your ongoing diligence to help me out.
TheJase
Each set of capturing () corresponds to one backreference.
You have two (\d+?), so that returns $1 and $2.
If a (\d+) matches multiple times, the backreference is left with the value of the last match.
If you don't want to use a split, you might try something like
/(\d+)/g
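Both points can be checked with a short sketch: the pattern from the question keeps only the last value in its repeated group, while a global pattern visited with exec() yields every value. The variable names are illustrative.

```javascript
const input = "123,456,789";

// The quantified group runs twice, but its single slot keeps only the last value.
const anchored = /^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/;
const m = input.match(anchored);
console.log(m[1], m[2]); // 123 789

// With the g flag, exec() can be called repeatedly to visit each match in turn.
const each = /(\d+)/g;
const values = [];
let hit;
while ((hit = each.exec(input)) !== null) {
  values.push(hit[1]);
}
console.log(values); // [ '123', '456', '789' ]
```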
ASKER
Wow, nice and easy, ozo. But, it doesn't enable me to check to see that at least two sets of values have been input:
/^(\d+?)(?#BETWEEN HERE AND...)(?:(?: *, *)(\d+?)){1,}?(?#...HERE)$/
I understand that I could count how many times it was matched via JavaScript, but I need it to be able to be written into the regex.
Is there NO way to capture each backreference in a quantified group?
Thanks,
TheJase
How about this....
I don't use regexp in java so you have to pad the message yourself...
This should allow you to catch the 1st and last number group in the string. Also, it will only match when there are at least 2 groups. I am not 100% sure if you need to escape commas...
(\d+)(,\d+)+
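A quick check of what this pattern stores, as a sketch in JavaScript (commas did not need escaping here; the variable names are illustrative):

```javascript
const pattern = /(\d+)(,\d+)+/;

const m = "123,456,789".match(pattern);
console.log(m[1]); // "123"
console.log(m[2]); // ",789" (the comma is inside the group, and ",456" was overwritten)

// A single value has no ",digits" part, so the pattern does not match at all.
console.log("123".match(pattern)); // null
```

Note that the pattern is not anchored with ^ and $, so it would also match a list embedded in other text.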
ASKER
Actually, rdrunner, I need to catch ALL of the groups, but have the requirement of no less than 2 values.
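One possible way to meet both requirements, sketched under the assumption that a two-step approach is acceptable: an anchored pattern checks that the whole field holds at least two values, then a global pattern collects them all. The function name `parseNumberList` is hypothetical, and this is not necessarily the approach taken in the accepted answer.

```javascript
// Hypothetical helper: returns every value, or null when the field is invalid.
function parseNumberList(text) {
  // Step 1: whole-string check, one number followed by one or more ", number".
  const shape = /^\d+(?: *, *\d+)+$/;
  if (!shape.test(text)) {
    return null;
  }
  // Step 2: collect every run of digits.
  return text.match(/\d+/g);
}

console.log(parseNumberList("123, 456,789")); // [ '123', '456', '789' ]
console.log(parseNumberList("123"));          // null
```

The validation still lives in a regex, as the form attribute requires; only the extraction of each value is moved into a second pass.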
ASKER CERTIFIED SOLUTION
This solution is only available to members.
P.S. The 2nd submatch of each match will contain the number you are looking for...
ASKER
Although I couldn't catch them in backreferences, at least it accomplishes the task. I'm a pragmatist, so I'm always up to compromise :). Thanks, rdrunner!
Glad you got it working....
I really like regexps, but I don't know them (very) well...
What is the format of the data you're trying to parse? Is it simply three sets of comma-delimited digits? If so, there are easier regexes you can use. If not, there are probably still easier regexes you can use :-) Tell us what it is you're trying to parse.
Webauk.