Understanding the regular expression engine's precedence rules in lookahead

I'm reading about the lookahead construct in regular expressions. It seems to have more to do with understanding the regular expression engine's precedence rules. Below is an explanation of lookahead, but I don't understand exactly when the engine evaluates the ?! and ?= conditions. Does the engine go all the way to the end of the regex, evaluate 'match' or 'fail', and then come back to the very beginning and reevaluate?
The last example is more complicated.
======================================

First, let's see how the engine applies q(?!u) to the string Iraq. The first token in the regex is the literal q. As we already know, this causes the engine to traverse the string until the q in the string is matched. The position in the string is now the void after the string. The next token is the lookahead. The engine takes note that it is inside a lookahead construct now, and begins matching the regex inside the lookahead. So the next token is u. This does not match the void after the string. The engine notes that the regex inside the lookahead failed. Because the lookahead is negative, this means that the lookahead has successfully matched at the current position. At this point, the entire regex has matched, and q is returned as the match.

Let's try applying the same regex to quit. q matches q. The next token is the u inside the lookahead. The next character is the u. These match. The engine advances to the next character: i. However, it is done with the regex inside the lookahead. The engine notes success, and discards the regex match. This causes the engine to step back in the string to u.

Because the lookahead is negative, the successful match inside it causes the lookahead to fail. Since there are no other permutations of this regex, the engine has to start again at the beginning. Since q cannot match anywhere else, the engine reports failure.

Let's take one more look inside, to make sure you understand the implications of the lookahead. Let's apply q(?=u)i to quit. The lookahead is now positive and is followed by another token. Again, q matches q and u matches u. Again, the match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with i. But i cannot match u. So this match attempt fails. All remaining attempts fail as well, because there are no more q's in the string.
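The three cases walked through above can be checked directly. This is a minimal sketch using Python's standard `re` module (an assumption, since the quoted explanation does not name a regex flavor); the variable names are illustrative only.

```python
import re

# Case 1: q(?!u) on "Iraq". The q is followed by the end of the string,
# so the negative lookahead succeeds and q alone is the match.
iraq = re.search(r"q(?!u)", "Iraq")
iraq_text = iraq.group(0) if iraq else None

# Case 2: q(?!u) on "quit". The q is followed by u, so the negative
# lookahead fails and there is no other q to try.
quit_negative = re.search(r"q(?!u)", "quit")

# Case 3: q(?=u)i on "quit". The lookahead succeeds but consumes nothing,
# so the token i is then compared against u and fails.
quit_positive = re.search(r"q(?=u)i", "quit")

print(iraq_text, quit_negative, quit_positive)  # q None None
```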
crcsupport Asked:

ozo Commented:
Since i never matches u, q(?=u)i can never match anything.

On the other hand, q(?=u).i is equivalent to qui: the lookahead checks for u without consuming it, and the . then consumes that same u.
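A quick way to convince yourself of both claims is to run the patterns over a few strings. This is a sketch using Python's `re` module; the sample words are arbitrary choices, not part of the original answer, and a handful of samples illustrates the equivalence rather than proving it.

```python
import re

# Arbitrary sample strings chosen for illustration.
samples = ["quit", "Iraq", "quiet", "qi", "equip"]

def span_or_none(pattern, text):
    """Return the (start, end) span of the first match, or None."""
    m = re.search(pattern, text)
    return m.span() if m else None

# q(?=u).i and qui should find the same span in every sample.
same_spans = all(
    span_or_none(r"q(?=u).i", s) == span_or_none(r"qui", s) for s in samples
)

# q(?=u)i should never match: the position after q cannot be both u and i.
never_matches = all(span_or_none(r"q(?=u)i", s) is None for s in samples)

print(same_spans, never_matches)  # True True
```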
crcsupport (Author) Commented:
ozo, after (?=u) has been evaluated, why does the token 'i' go back and look at 'u' in the string?
ozo Commented:
Because (?=u) is a zero-width positive lookahead assertion.
crcsupport (Author) Commented:
ozo,
I don't understand what you're saying. Your two posts above are no more helpful than what I pasted in the original post, which is why I posted the question here.
Can you explain it as you would to someone who has just started trying to understand what lookahead is? Some examples showing how the engine evaluates the current token against the current string position would help.

Thanks
ozo Commented:
A zero-width assertion consumes no characters; it only asserts whether a match would succeed at the current position.

It works like the anchor ^, which matches at the beginning of the line but doesn't consume any characters.
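"Zero-width" can be seen in the span a match reports. In this sketch (Python's `re` module, assumed here for illustration), both the lookahead and the anchor match successfully, yet the matched text is empty and the start and end positions coincide.

```python
import re

# A bare lookahead: succeeds at the first position where the next char is u.
lookahead = re.search(r"(?=u)", "quit")
lookahead_span = lookahead.span()   # start == end: nothing was consumed
lookahead_text = lookahead.group(0)

# The ^ anchor behaves the same way at the start of the string.
anchor = re.search(r"^", "quit")
anchor_span = anchor.span()
anchor_text = anchor.group(0)

print(lookahead_span, repr(lookahead_text))  # (1, 1) ''
print(anchor_span, repr(anchor_text))        # (0, 0) ''
```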
Dan Craciun (IT Consultant) Commented:
It's not that complicated, really.

q(?=u)i applied to quit, where | is the current position.
1. |quit: q=q match
2. q|uit: (?=u)=u match
3. q|uit: i=u fail

Basically, after the lookahead the string position does not move.
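The trace above can be reproduced with a toy matcher. This is only a sketch for building intuition: `trace` and its token format are hypothetical names invented here, it handles just single-character literals and single-character lookaheads at one start position, and it is not how a real regex engine is implemented.

```python
def trace(tokens, text, start=0):
    """Try tokens at one start position; return (matched, steps).

    Each token is (kind, ch) where kind is "lit", "ahead" (?=ch)
    or "not_ahead" (?!ch). Each step records the string with | at the
    current position, the token, and whether it succeeded.
    """
    pos = start
    steps = []
    for kind, ch in tokens:
        # None stands for "end of string" (the void after the last char).
        nxt = text[pos] if pos < len(text) else None
        if kind in ("lit", "ahead"):
            ok = nxt == ch
        elif kind == "not_ahead":
            ok = nxt != ch
        else:
            raise ValueError(f"unknown token kind: {kind}")
        steps.append((text[:pos] + "|" + text[pos:], kind, ch, ok))
        if not ok:
            return False, steps
        if kind == "lit":
            pos += 1  # only literals consume; lookaheads leave pos unchanged
    return True, steps


# q(?=u)i applied to "quit", as in the three numbered steps above.
matched, steps = trace([("lit", "q"), ("ahead", "u"), ("lit", "i")], "quit")
for marker, kind, ch, ok in steps:
    print(marker, kind, ch, "match" if ok else "fail")
```

Note that the position marker is identical in steps 2 and 3, which is the whole point: the lookahead tested the u but left it for the next token.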

s(?=o)t applied to storm:
1. |storm: s=s match
2. s|torm: (?=o) match
3. s|torm: t=t match

HTH,
Dan
crcsupport (Author) Commented:
Dan,
At step 2 of the second example, (?=o) fails, I guess because the next character is t?
It should be (?!o).
ozo Commented:
(?=o)t  always fails, for the same reason that (?=u)i and \b\B always fail.
Dan may have meant st(?=o)
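The difference between the pattern as posted and the suggested st(?=o) shows up immediately when both are run against storm. A small sketch with Python's `re` module (assumed for illustration; the test string for \b\B is an arbitrary choice):

```python
import re

# st(?=o): s and t are consumed, then the lookahead confirms o follows.
corrected = re.search(r"st(?=o)", "storm")
corrected_text = corrected.group(0)  # the o is checked but not included

# s(?=o)t: after s, the next character would have to be both o and t.
as_posted = re.search(r"s(?=o)t", "storm")

# \b\B: a position cannot be both a word boundary and a non-boundary.
contradiction = re.search(r"\b\B", "storm quit")

print(corrected_text, as_posted, contradiction)  # st None None
```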
crcsupport (Author) Commented:
I'm amazed. This is a very powerful construct; it feels like piping plus a condition.
Thanks!!