  • Status: Solved
  • Priority: Medium
  • Security: Public

Regular Expressions: Can't find multiple backreferences

Why, in JavaScript (or probably any other language), does the regular expression below:

/^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/

only return these backreferences:

$1 = 123
$2 = 789

when supplied with this data?

123,456,789


I am trying to make it return:

$1 = 123
$2 = 456
$3 = 789

Keep in mind that the input can contain more values than that; the sample data is JUST an example. This is being used to parse input data on a web form.
Asked by: TheJase
1 Solution
 
webauk Commented:
Just a quick point - JavaScript in Internet Explorer 5 does not support non-greedy matches (e.g. \d+?); this only came in with Internet Explorer 5.5.

What is the format of the data you're trying to parse? Is it simply three sets of comma-delimited digits? If so, there are easier regexes you can use. If not, there are probably still easier regexes you can use :-) Tell us what it is you're trying to parse.

Webauk.
 
TheJase (Author) Commented:
I'm trying to parse a set of comma-separated values (numeric only).  In the RegExp above, you'll notice that it allows for a total of 2+ values. So, the given value to parse could be anywhere from "value1, value2" to "value1, value2..., value(n)"

What I want to do is parse EACH csv into backreferences.  However, the only backreferences I get are the first value (I understand how), and only the LAST value (not what I expected).

Thanks,
TheJase
 
webauk Commented:
I'm tempted to ask why bother using back-references at all, why not simply use the split method?

e.g.

var re = /\s*,\s*/;
var str = "123,456,789";
var arr = str.split(re);
alert(arr[0] + " and " + arr[1] + " and " + arr[2]);

arr now contains all the numeric values that were delimited by commas (ignoring any whitespace around the commas)

webauk
 
TheJase (Author) Commented:
I totally understand what you're saying, but I'm building a form validation applet that will iterate through form fields, looking for attributes such as [validate="/^(\d){5}$/"] and use that value to validate the field, and possibly parse the backreferences to variables if needed.

Thanks,
TheJase
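
A rough sketch of the attribute-driven validation described above, reduced to plain strings so it runs without a form. The helper names `regexFromAttribute` and `validateValue` are hypothetical, and the "/pattern/flags" attribute format is only assumed from the `validate` example in the post; this is not the actual applet.

```javascript
// Hypothetical helper: turn a "/pattern/flags" attribute string into a RegExp.
function regexFromAttribute(attr) {
  var parts = /^\/(.*)\/([a-z]*)$/.exec(attr);
  if (!parts) {
    return null; // not in the assumed /pattern/flags form
  }
  return new RegExp(parts[1], parts[2]);
}

// Hypothetical helper: returns the match array (whole match plus groups),
// or null when the value does not validate.
function validateValue(attr, value) {
  var re = regexFromAttribute(attr);
  return re ? re.exec(value) : null;
}

// The attribute value from the post, written as a JavaScript string literal.
var zipResult = validateValue("/^(\\d){5}$/", "90210");
var badResult = validateValue("/^(\\d){5}$/", "1234");
```

Here `zipResult` is a match array and `badResult` is null, so the same call can serve for both validation and reading out any captured groups.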
 
webauk Commented:
Now that puts a whole different light on it :-)

"a form validation applet " - that sounds very handy. I could make use of that here (hint, hint)

I'll have another look at this, but I'm in a different office and won't have much time today. Nevertheless it's a puzzle to be solved :-)

webauk
 
webauk Commented:
OK I think I'm guilty of "not seeing the wood for the trees" :-)

In a regex if you put things into round brackets the patterns matched will go into "regular expression memory" or become a "back reference" and become available as $1, $2, $3...

However, if you use (?:  ) then you are telling the reg ex engine NOT to store the contents as a backreference. In your reg ex:

/^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/

You have only two sets of round brackets without ?: therefore only two matches will go into regex memory / backreferences.

I must confess to being puzzled by your use of (?:(?:  ) ) what's the thinking there?

webauk
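
The difference between capturing and non-capturing groups described above can be checked with a small sketch. The two patterns and the sample string are made up for illustration and are not taken from the thread.

```javascript
// Capturing group: the ",456" part is stored as group 2.
var capturing = /^(\d+)(,\d+)$/.exec("123,456");

// Non-capturing group: (?: ) groups the same text without storing it,
// so only group 1 is available afterwards.
var nonCapturing = /^(\d+)(?:,\d+)$/.exec("123,456");

console.log(capturing.length);    // whole match + 2 groups
console.log(nonCapturing.length); // whole match + 1 group
```

Both patterns match the same text; only the number of stored groups differs.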
 
TheJase (Author) Commented:
It is very possible that I might not be using correct syntax...

But, notice the {1,} which should mean one or more times.  So, I figured if I had [123,456,789] as data, the first (\d+?) would catch 123, and the second (\d+?) would catch 456 and then 789 since it has a quantifier outside the group.

Thanks for your ongoing diligence to help me out.

TheJase
 
ozo Commented:
Each set of capturing () corresponds to one backreference.
You have two (\d+?), so that returns $1 and $2.
If a (\d+) matches multiple times, the backreference is left with the value of the last match.
If you don't want to use a split, you might try something like
/(\d+)/g
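
A short sketch of both points, using the sample data and the pattern from the question. The variable names are only illustrative.

```javascript
var str = "123,456,789";

// The repeated group keeps only the value from its last iteration.
var anchored = /^(\d+?)(?:(?: *, *)(\d+?)){1,}?$/.exec(str);
console.log(anchored[1]); // "123"
console.log(anchored[2]); // "789" ("456" was overwritten)

// With the g flag, String.prototype.match returns every match instead.
var all = str.match(/(\d+)/g);
console.log(all); // ["123", "456", "789"]
```

Note that with the g flag, `match` returns only the whole matches, not the individual groups of each match.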
 
TheJase (Author) Commented:
Wow, nice and easy, ozo.  But, it doesn't enable me to check to see that at least two sets of values have been input:
/^(\d+?)(?#BETWEEN HERE AND...)(?:(?: *, *)(\d+?)){1,}?(?#...HERE)$/

I understand that I could count how many times it was matched via JavaScript, but I need it to be able to be written into the regex.

Is there NO way to capture each backreference in a quantified group?

Thanks,
TheJase
 
rdrunner Commented:
How about this....

I don't use regexp in java so you have to pad the message yourself...

This should allow you to catch the first and last number groups in the string. It will also only match if there are at least 2 groups. I am not 100% sure if you need to escape commas...

(\d+)(,\d+)+
 
TheJase (Author) Commented:
Actually, rdrunner, I need to catch ALL of the groups, but have the requirement of no less than 2 values.
 
rdrunner Commented:
Why don't you stick to a simple match then?

It won't allow you to catch them all as backreferences, but it will allow you to catch them as a single match per number. So the matches collection will hold all numbers. You will not get them as backreferences but you will get them...

(,?(\d+)){2,}
 
rdrunner Commented:
P.S. The 2nd submatch of each match will contain the number you are looking for...
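
One caveat, with a sketch of an alternative: because the comma in `(,?(\d+)){2,}` is optional, a single value with two or more digits can also satisfy `{2,}` on its own. The sketch below instead validates with an anchored pattern that requires at least two values, then collects the numbers with a global match. The function name `parseNumberList` and its patterns are assumptions for illustration, not something posted in the thread.

```javascript
// The optional comma lets one multi-digit value count as two repetitions.
var looseMatchesSingleValue = /(,?(\d+)){2,}/.test("12");

// Hypothetical helper: returns an array of number strings,
// or null when fewer than two comma-separated values are present.
function parseNumberList(input) {
  // Step 1: validate the overall shape (at least two values).
  if (!/^\s*\d+(?:\s*,\s*\d+)+\s*$/.test(input)) {
    return null;
  }
  // Step 2: extract every run of digits.
  return input.match(/\d+/g);
}

var three = parseNumberList("123,456,789");
var single = parseNumberList("123");
var spaced = parseNumberList("1 , 2");
```

This keeps the "at least two values" rule inside a regex, as requested earlier in the thread, while the capture of every value happens in the second step.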
 
TheJase (Author) Commented:
Although I couldn't catch them in backreferences, at least it accomplishes the task.  I'm a pragmatist, so I'm always up to compromise :).  Thanks, rdrunner!
 
rdrunner Commented:
Glad you got it working....

I really like regexes but I don't know them (very) well...