leakim971 asked:

Regex to match an exact number of occurrences of a sequence

Hello Experts,

I'm trying to build a regex that matches an exact number of occurrences of a sequence/word:

For example, the sequence "abc" with a required count of 3: exactly three occurrences, no more, no less.

So:
abcXYZabcEFGabc matches
abcXYZabcEFGabcKLMabc doesn't match, because abc occurs four times
abcXYZabcEFG doesn't match, because abc occurs only twice
abcabcabc matches

I have this: /^([^a]*a[^a]*){3}$/ but it only works because I'm looking for the single character "a" and not for a sequence such as "ab" or "JPQ".

Thanks for your help!

Kind regards.
SOLUTION
Patrick Matthews
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER

Thanks a lot for your reply.
The language is JavaScript, and it may sound stupid, but I need to use a regexp.

Something like: /^([^a]*a[^a]*){3}$/.test("abcXYZabcEFGabc"), which returns true or false.
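
To illustrate why the single-character pattern does not carry over to sequences, here is a small sketch. The variable name `exactlyThreeA` and the extra test strings are made up for this example and are not from the thread. The pattern counts the letter "a", not the sequence "abc":

```javascript
// The asker's pattern: true when the string contains exactly three "a" characters.
var exactlyThreeA = /^([^a]*a[^a]*){3}$/;

// Three "abc" sequences, and also exactly three "a" characters.
console.log(exactlyThreeA.test("abcXYZabcEFGabc"));   // true

// No "abc" at all, but still exactly three "a" characters.
console.log(exactlyThreeA.test("aXaXa"));             // true

// Three "abc" sequences plus one stray "a": four "a" characters in total.
console.log(exactlyThreeA.test("abcXaYZabcEFGabc"));  // false
```

The negated class [^a] can only exclude one character at a time, which is why a multi-character sequence needs a different construction.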



Why must it be RegExp?
SOLUTION
SOLUTION
Why a regex? Because I can pass it as a parameter, for example.
@phr0ze and @sentner, thank you. I'm going to try your suggestions.
@phr0ze

The problem with your pattern is that the *? on the bracket expression will consume any trailing "abc" strings after the first three. Simply adding the negative lookahead does not overcome this.
I believe the following satisfies the requirement:
^(?=(?:(?:[^a]|a[^ab])*abc){3}(?!.*?abc)).*$

@kaufmed, this one doesn't match: affkaabcsjfjabckfsofabc
affk a abc sjfj abc kfsof abc

It seems it did not upon further testing. Revised:
^(?=(?:(?:[^a]|a[^b]|a(?=a))*abc){3}(?!.*?abc)).*$

Yeah, the double-a killed it. The modified version has this corrected.
It doesn't match: abffkabcsjfjabckfsofabc

ab ffk abc sjfj abc kfsof abc
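
The two counterexamples reported above can be reproduced with a short check. This is only a sketch: the patterns and input strings are copied from the posts, while the names `firstPattern`, `revisedPattern`, `samples` and `checkSamples` are invented for the illustration.

```javascript
// The two patterns posted in the thread, copied verbatim.
var firstPattern = /^(?=(?:(?:[^a]|a[^ab])*abc){3}(?!.*?abc)).*$/;
var revisedPattern = /^(?=(?:(?:[^a]|a[^b]|a(?=a))*abc){3}(?!.*?abc)).*$/;

// Each sample contains "abc" exactly three times, so both patterns should match all of them.
var samples = [
  "abcXYZabcEFGabc",          // plain case
  "affkaabcsjfjabckfsofabc",  // "aa" directly before an occurrence
  "abffkabcsjfjabckfsofabc"   // partial prefix "ab" that is not followed by "c"
];

// Run both patterns against every sample and collect the results.
function checkSamples() {
  return samples.map(function (s) {
    return { input: s, first: firstPattern.test(s), revised: revisedPattern.test(s) };
  });
}

console.log(checkSamples());
// first:   true, false, false
// revised: true, true,  false
```

The last sample fails with both patterns because no alternative can consume an "a" that is followed by "b" unless the full "abc" follows.
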
ASKER CERTIFIED SOLUTION
@kaufmed, yes, it works this time, thanks. The "problem" with this regexp is the way we build it dynamically.
The original sequence needs to have three characters, for example: abc

Anyway, I want to thank you a lot for your time. I'm going to use an array of regex+boolean and follow @sentner's logic.

I'm going to close the question
Thank you to everyone for your suggestions, have a great weekend!
End of week, enjoy, have a nice weekend!
Thanks a lot to everyone! Have fun in your life!
I figured I'd throw this out there just in case, but the string shouldn't be that difficult to build dynamically; granted I don't know what your environment is like. Here is a function I created to build the pattern string:
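function buildPattern(src)
{
	// "###" is replaced by the sequence, "@@@" by the alternation of "filler" pieces
	var basePattern = "^(?=(?:(?:@@@)*###){3}(?!.*?###)).*$";
	var temp = "";
	var t = src.charAt(0);
	
	// For each proper prefix of src, allow that prefix followed by a character
	// that breaks the sequence, e.g. "a[^b]" and "ab[^c]" for "abc"
	for (var i = 1; i < src.length; i++)
	{
		temp += "|" + t + "[^" + src.charAt(i) + "]";
		t += src.charAt(i);
	}

	// Any character other than the first one, or the first character doubled
	temp = "[^" + src.charAt(0) + "]|" + src.charAt(0) + "(?=" + src.charAt(0) + ")" + temp;
	
	return basePattern.replace(/###/g, src).replace(/@@@/, temp);
}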

Of course, make sure your source strings don't contain the sequences "###" or "@@@"   :)
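
For comparison, here is a sketch of a different technique that nobody in this thread proposed: a negative lookahead in front of every filler character, sometimes called a "tempered" dot. The helpers `escapeRegExp` and `buildExactCountRegex` are hypothetical names for this example. It assumes a non-empty sequence and counts non-overlapping occurrences from left to right. Because it escapes the sequence and builds the pattern by concatenation, it needs no placeholder strings.

```javascript
// Escape regex metacharacters so the sequence is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a regex that matches strings containing
// exactly `count` non-overlapping occurrences of `seq` (assumed non-empty).
function buildExactCountRegex(seq, count) {
  var lit = escapeRegExp(seq);
  // Filler: any run of characters in which no position starts an occurrence of seq.
  var filler = "(?:(?!" + lit + ")[\\s\\S])*";
  // `count` times "filler then seq", then filler up to the end of the string.
  return new RegExp("^(?:" + filler + lit + "){" + count + "}" + filler + "$");
}

var threeAbc = buildExactCountRegex("abc", 3);
console.log(threeAbc.test("abcXYZabcEFGabc"));          // true
console.log(threeAbc.test("abcXYZabcEFGabcKLMabc"));    // false (four occurrences)
console.log(threeAbc.test("abcXYZabcEFG"));             // false (two occurrences)
console.log(threeAbc.test("abffkabcsjfjabckfsofabc"));  // true
```

Since the filler can never step over the start of an occurrence, each repetition of the group is forced to stop at the next "abc", so the sequence can be any length.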
Thanks a lot @kaufmed!
Sure. Glad to help!