# Understanding the Greedy Quantifiers * and +

I ran my program with the arguments given below:

[wow]* "wow its cool"
Output: 0 "wow" 3 "" 4 "" 5 "" 6 "" 7 "" 8 "" 9 "oo" 11 "" 12 ""

[wow]+ "wow its cool"
Output: 0 "wow" 9 "oo"

I know that * is a greedy quantifier which matches as many as it can and must be zero or more, so there are empty strings after "wow". But can you tell me the reason I got "oo" in the output for both + and *?

I find greedy quantifiers hard to understand. I know + means one or more and * means zero or more, but I am not sure in which situations I am supposed to use each of them.
Can you simplify the difference between them? (Any analogy or examples would be helpful.)

I used the code below to try those greedy quantifiers:

``````
import java.util.regex.*;

public class TestClass {
    public static void main(String[] args) {
        // args[0] is the regex, args[1] is the text to search
        Pattern p = Pattern.compile(args[0]);
        Matcher m = p.matcher(args[1]);
        // Print the start index and text of every match
        while (m.find()) {
            System.out.print(m.start() + " \"" + m.group() + "\" ");
        }
    }
}
``````

I will appreciate your response.
Java

techbro

8/22/2022 - Mon
for_yan

Because there is an "o" inside the [] character class, and the quantifier allows any number of them, the two characters "oo" will
match. Makes sense.
for_yan

* means zero or more, so it will match even when nothing is in between, and + requires something in between
for_yan

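so the pattern "C.*L" will match "CL" (the quantifier has to sit on whatever goes between the two letters; in "C*L" the * would apply to the C itself),
but the pattern "C.+L" should not match the string "CL"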
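A quick way to check this is to keep in mind that a quantifier binds to the element right before it. In `C+L` the `+` applies to the `C`, so to require something between the two letters the quantifier has to sit on an element in between, such as `.`. A minimal sketch (the class and helper names are made up for illustration):

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class QuantifierScopeDemo {
    // True if the regex matches the whole input string.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("C*L", "CL"));  // true: zero or more C, then L
        System.out.println(fullMatch("C+L", "CL"));  // true: one or more C, then L
        System.out.println(fullMatch("C.*L", "CL")); // true: nothing needed between C and L
        System.out.println(fullMatch("C.+L", "CL")); // false: at least one character needed between
    }
}
```

Only the last pattern rejects "CL", because it is the only one that demands at least one character between the two letters.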
for_yan

So [wow]* means that you are looking for a string made up of the characters "w" or "o", any number of them.
Therefore "oo" should match, "ww" should match, and "wo" should match. It is probably not necessary to repeat "w" two times
inside the brackets.
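To see that the repeated "w" makes no difference, here is a small sketch that runs `[wow]+` and `[ow]+` over the string from the question. The class name and the `findAll` helper are hypothetical, written just for this check:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassSetDemo {
    // Collects every match as "startIndex:matchedText".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.start() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "wow its cool";
        // A character class is a set of allowed characters, so duplicates change nothing.
        System.out.println(findAll("[wow]+", input)); // [0:wow, 9:oo]
        System.out.println(findAll("[ow]+", input));  // [0:wow, 9:oo]
    }
}
```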
for_yan

OK, now I tested some of my statements above:

"C[wow]*L" should match the string "CL", as even nothing between "C" and "L" will match. Correct, see below:

``````
// Fragment: runs inside main() and needs import java.util.regex.*;
Pattern p6 = Pattern.compile("C[wow]*L");
Matcher m6 = p6.matcher("CL");
// Print the start index and text of every match
while (m6.find()) {
    System.out.print("output: " + m6.start() + " \"" + m6.group() + "\" ");
}

System.out.println("");
``````

``````output: 0 "CL"
``````
for_yan

"C[wow]+L" should then not match the string "CL", as it needs at least one of the characters inside the [] between "C" and "L". Correct, see below:

``````
// Fragment: runs inside main() and needs import java.util.regex.*;
Pattern p6 = Pattern.compile("C[wow]+L");
Matcher m6 = p6.matcher("CL");
// Print the start index and text of every match
while (m6.find()) {
    System.out.print("output: " + m6.start() + " \"" + m6.group() + "\" ");
}

System.out.println("");
``````

No matches are printed, as expected.
for_yan

More tests, consistent with the statements above.

I guess it all seems understandable.
Please let me know if you still have any doubts.

code:

``````
// Fragment: runs inside main() and needs import java.util.regex.*;
Pattern p6 = Pattern.compile("[wowwowow]+");
Matcher m6 = p6.matcher("oo");
// Print the start index and text of every match
while (m6.find()) {
    System.out.print("output: " + m6.start() + " \"" + m6.group() + "\" ");
}

System.out.println("");
``````

output:
``````output: 0 "oo"
``````

code:

``````
// Fragment: runs inside main() and needs import java.util.regex.*;
Pattern p6 = Pattern.compile("[o]+");
Matcher m6 = p6.matcher("oo");
// Print the start index and text of every match
while (m6.find()) {
    System.out.print("output: " + m6.start() + " \"" + m6.group() + "\" ");
}

System.out.println("");
``````

output:
``````output: 0 "oo"
``````

Mick Barry

> But can you tell me the reason I got "oo" in the output for both + and *?

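[wow]*

matches zero or more characters, each of which is a 'w' or an 'o'.
The second w in the brackets is actually redundant.

Get rid of the square brackets if you want it to match just the literal 'wow' (use a group, (wow)+, if you want to repeat the whole word).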
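As a rough illustration of the difference between the character class and the literal word, the sketch below runs three patterns over a made-up input, "wowwow cool". The class name, the helper, and the input string are assumptions chosen for the demo, not part of the original question:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the input are illustrative only.
class LiteralVsClassDemo {
    // Collects every match as "startIndex:matchedText".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.start() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "wowwow cool";
        // Character class: any run of w's and o's, so "oo" in "cool" matches too.
        System.out.println(findAll("[wow]+", input)); // [0:wowwow, 8:oo]
        // Literal: exactly the three characters w, o, w in that order.
        System.out.println(findAll("wow", input));    // [0:wow, 3:wow]
        // Group with +: the whole word "wow", repeated one or more times.
        System.out.println(findAll("(wow)+", input)); // [0:wowwow]
    }
}
```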
for_yan

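As "[wow]*" does not need any single instance to match, it also matches all these empty strings.
It finds one empty match at every position where the next character is not a "w" or an "o" (indexes 3 to 8 and 11 in your example),
plus one at the very end of the input (index 12). Positions covered by a non-empty match ("wow" at 0, "oo" at 9) do not produce an empty one.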
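To check where the empty matches actually land, this small sketch collects the start index of every empty match for the string from the question. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class EmptyMatchDemo {
    // Returns the start index of every zero-length match found by find().
    static List<Integer> emptyMatchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            if (m.group().isEmpty()) {
                starts.add(m.start());
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "wow its cool"; // length 12, so index 12 is the end of input
        System.out.println(emptyMatchStarts("[wow]*", input)); // [3, 4, 5, 6, 7, 8, 11, 12]
        System.out.println(emptyMatchStarts("[wow]+", input)); // []
    }
}
```

The indexes line up with the empty strings in the output posted in the question: one for each character that is not "w" or "o", and one at the end of the input.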
Mick Barry

> but I am not sure under what situation I am supposed to use each of them?

Typically, when at least a single instance is required, use +.
If it's optional, use *.
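One way to picture this is a "key = value" line, where the key and value must have at least one character (so `+`), while the spaces around the equals sign are optional (so `*`). The following is only a sketch under that assumption; the class name, the `parse` helper, and the line format are invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the "key = value" format are illustrative only.
class KeyValueDemo {
    // \w+ : key and value are required (one or more word characters)
    // \s* : whitespace around '=' is optional (zero or more)
    private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)\\s*=\\s*(\\w+)");

    // Returns "key->value" if the whole line fits the pattern, otherwise null.
    static String parse(String line) {
        Matcher m = KEY_VALUE.matcher(line);
        return m.matches() ? m.group(1) + "->" + m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(parse("size=10"));     // size->10
        System.out.println(parse("size  =  10")); // size->10
        System.out.println(parse("=10"));         // null, because the key is missing
    }
}
```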
Mick Barry

The example in your question also shows the difference.
When you use +, at least one character is required, so the empty strings no longer match.