Using regular expressions, I am getting a successful match but no matching string.

I am using the below code to loop through an expression as long as two terms are being multiplied together or divided.
Dim rgx As Regex = Nothing
Dim m As Match
Dim Expression As String = "(a)(b)(a)"

Do
     'Code to manipulate Expression.
     'Expression now equals "a<sup>2</sup>b"
     rgx = New Regex("([\-]?[0-9]+(?:\.[0<wbr ></wbr>-9]*)*)?(<<wbr ></wbr>sup>([\-]?<wbr ></wbr>[0-9]*)</s<wbr ></wbr>up>)?(<sup<wbr ></wbr>>E[-|+][0-<wbr ></wbr>9]*</sup>)<wbr ></wbr>?(([\-]?[a<wbr ></wbr>-z](<sup>[<wbr ></wbr>\-]?[0-9]*<wbr ></wbr></sup>)?)*<wbr ></wbr>)(?=[*|/])<wbr ></wbr>")
     m = rgx.Match(Expression)
Loop While m.Success

The resulting expression (a<sup>2</sup>b) is correct and the pattern should no longer match anything in the string.  When I look at the value of m it equals {} but for some reason m.Success=true.

This causes the Do Loop to continue, but the string is no longer changed by the code, which results in an infinite loop.

I have tested the pattern and expression string using www.regexr.com/v1 and I do not get a match (as it should be).    I am not new to regular expressions but have never seen this before.

Can someone please explain this and give me some suggestions on how I resolve this issue?
Asked by NevSoFly

louisfr Commented:
There is a problem with [0<wbr ></wbr>-9]
What do you expect it to match? As it is, that's not a valid character class, since '>' is greater than '9'.

NevSoFly (Author) Commented:
I'm sorry I did a cut and paste.  <wbr></wbr> is not in my original code and I don't know where it came from.  The code should have been as follows.
Dim rgx As Regex = Nothing
Dim m As Match
Dim Expression As String = "(a)(b)(a)"

Do
     'Code to manipulate Expression.
     'Expression now equals "a<sup>2</sup>b"
     rgx = New Regex("([\-]?[0-9]+(?:\.[0-9]*)*)?(<sup>([\-]?[0-9]*)</sup>)?(<sup>E[-|+][0-9]*</sup>)?(([\-]?[a-z](<sup>[\-]?[0-9]*</sup>)?)*)(?=[*|/])")
     m = rgx.Match(Expression)
Loop While m.Success

louisfr Commented:
Except for (?=[*|/]) each part of your regex is optional.
The regex succeeds with the empty string located between < and /
Check m.Index and m.Length. They should be respectively 8 and 0.
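
To see the zero-length match concretely, here is a minimal sketch in Python rather than VB.NET, assuming Python's `re` module treats this particular pattern the same way .NET's Regex does. The helper `has_real_match` is a hypothetical name, not part of either library; it counts only non-empty matches as success.

```python
import re

# The corrected pattern from the question, copied as-is.
PATTERN = re.compile(
    r"([\-]?[0-9]+(?:\.[0-9]*)*)?(<sup>([\-]?[0-9]*)</sup>)?"
    r"(<sup>E[-|+][0-9]*</sup>)?(([\-]?[a-z](<sup>[\-]?[0-9]*</sup>)?)*)(?=[*|/])"
)

expression = "a<sup>2</sup>b"
m = PATTERN.search(expression)

# The search succeeds, but the matched text is empty: every group is optional,
# so only the look-ahead has to hold, and it holds just before the "/" of "</sup>".
print(m is not None, m.start(), len(m.group(0)))


def has_real_match(text: str) -> bool:
    """Hypothetical helper: report success only if some match is non-empty."""
    return any(len(found.group(0)) > 0 for found in PATTERN.finditer(text))


print(has_real_match(expression))  # no term is followed by * or /
print(has_real_match("2a*3b"))     # "2a" is followed by "*"
```

In the VB.NET loop the analogous check would be on m.Length in addition to m.Success, though tightening the pattern so it cannot match an empty string is the more direct fix.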

NevSoFly (Author) Commented:
I checked and you're right: m.Index=8 and m.Length=0.  The reason every part of my regex is optional is that any part might or might not be there, but at least one of them has to be there.

But why is it when I test it in www.regexr.com/v1 I do not get a match at all?

Do you have any suggestions on how I can change the pattern to fit my needs?

NevSoFly (Author) Commented:
I made some slight alterations to overcome some shortcomings that I noticed in my pattern.  Here is my breakdown:

([\-]?[0-9]+(?:\.[0-9]*)*)      Coefficient
?(<sup>([\-]?[0-9]*)</sup>)     Exponent
?(<sup>E[-|+][0-9]*</sup>)      Sci-Notation
?(([\-]?[a-z]((<sup>[\-]?[0-9]*</sup>)|(<sup>E[-|+][0-9]*</sup>))?)*)      Variable w/ Exponent or Sci-Notation

What I need is a pattern that matches a string (term in this case) that must have either a coefficient or a variable.  

If it has a coefficient then the coefficient may have either an exponent, scientific notation or neither.  

If it has a variable then the variable may have either an exponent, scientific notation or neither.

louisfr Commented:
The regexr site forbids regexes which can match 0 characters. The problem is explicitly indicated if you enter the expression on the home page http://www.regexr.com/ and hover over the "Infinite" red button : "The expression can match 0 characters, and therefore matches infinitely".

Exponent OR scientific notation, but not both? You're allowing an exponent followed by scientific notation on the coefficient, but only one of them on the variable.

The coefficient is an optional minus sign, then series of digits and dots? This is allowed: -1...5.23..4

The variable part can be this: a<sup></sup>
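
The two observations above can be checked with a short sketch, written in Python on the assumption that these fragments behave the same there as in .NET. The original fragments are copied from the breakdown in the question; the `STRICT_` names are hypothetical and only illustrate the tightened forms (one optional decimal part, at least one digit in the exponent).

```python
import re

# Coefficient and variable fragments copied from the breakdown in the question.
COEFFICIENT = re.compile(r"[\-]?[0-9]+(?:\.[0-9]*)*")
VARIABLE = re.compile(
    r"[\-]?[a-z]((<sup>[\-]?[0-9]*</sup>)|(<sup>E[-|+][0-9]*</sup>))?"
)

# fullmatch succeeds only if the whole string fits the fragment.
loose_coefficient_ok = COEFFICIENT.fullmatch("-1...5.23..4") is not None
loose_variable_ok = VARIABLE.fullmatch("a<sup></sup>") is not None
print(loose_coefficient_ok, loose_variable_ok)

# Tightened fragments (hypothetical names): a single optional decimal part,
# and \d+ instead of [0-9]* so an empty exponent is rejected.
STRICT_COEFFICIENT = re.compile(r"-?\d+(?:\.\d*)?")
STRICT_VARIABLE = re.compile(r"-?[a-z](<sup>-?\d+</sup>)?")

strict_coefficient_ok = STRICT_COEFFICIENT.fullmatch("-1...5.23..4") is not None
strict_variable_ok = STRICT_VARIABLE.fullmatch("a<sup></sup>") is not None
print(strict_coefficient_ok, strict_variable_ok)
```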

Here is a modified version of your regex. I changed the coefficient to be a number with optional decimal part. You can change it back if you want. I match either a mandatory coefficient followed by an optional variable OR a mandatory variable, ensuring that at least one of them matches:
(-?\d+(?:\.\d*)?)(<sup>(-?\d+)</sup>)?(<sup>E[-+]\d+</sup>)?(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))?
|
(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))
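
As a rough check of the modified regex, the sketch below joins its two alternatives into one pattern and lists what it matches. It is written in Python on the assumption that the pattern behaves the same there as in .NET, and `terms` is a hypothetical helper name.

```python
import re

# The two alternatives from the modified regex, joined with "|".
MODIFIED = re.compile(
    r"(-?\d+(?:\.\d*)?)(<sup>(-?\d+)</sup>)?(<sup>E[-+]\d+</sup>)?"
    r"(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))?"
    r"|"
    r"(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))"
)


def terms(text: str) -> list[str]:
    """Hypothetical helper: return the text of every match, in order."""
    return [m.group(0) for m in MODIFIED.finditer(text)]


print(terms("a<sup>2</sup>b"))
print(terms("2a<sup>2</sup>*3b"))
print(terms("1.5<sup>E+3</sup>x"))
```

Because each alternative requires at least one digit or one letter, this pattern cannot produce the empty match that kept the original loop running.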

NevSoFly (Author) Commented:
First, thank you very much for your time.   I tested the regex and everything seems to work.  

I do have one problem.  There are some instances where the expression that I am using will have an exponent or scientific notation outside a set of parentheses, e.g. (2a<sup>2</sup>*3b)<sup>4</sup>.  In this case the regex you supplied matches each alphanumeric character separately in <sup>4</sup>.  I would not want any part of <sup>4</sup> to match at all.

louisfr Commented:
You can test if the match is being made against a tag, or against something inside a tag with

a look-ahead expression
(add here the matching pattern)(?![^<]*</|[a-z]*>)


or a look-behind expression
(?<!</?[a-z]*|<[a-z]+>[^<]*)(add here the matching pattern)
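
The look-ahead form can be tried with the sketch below, written in Python on the assumption that it behaves the same there as in .NET. Only the look-ahead is shown because Python's `re` module requires fixed-width look-behind patterns, so the look-behind form above would not compile there. The names `TERM`, `NOT_IN_TAG`, `plain` and `guarded` are hypothetical.

```python
import re

# Term pattern from the modified regex earlier in the thread, as one alternation.
TERM = (
    r"(-?\d+(?:\.\d*)?)(<sup>(-?\d+)</sup>)?(<sup>E[-+]\d+</sup>)?"
    r"(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))?"
    r"|(-?[a-z]((<sup>-?\d+</sup>)|(<sup>E[-+]\d+</sup>)?))"
)
# Negative look-ahead from the comment above: reject a match that is followed
# by a closing tag with no "<" in between, or that sits inside a tag name.
NOT_IN_TAG = r"(?![^<]*</|[a-z]*>)"

plain = re.compile(TERM)
# The non-capturing group makes the look-ahead apply to both alternatives.
guarded = re.compile("(?:" + TERM + ")" + NOT_IN_TAG)

text = "(2a<sup>2</sup>*3b)<sup>4</sup>"
plain_matches = [m.group(0) for m in plain.finditer(text)]
guarded_matches = [m.group(0) for m in guarded.finditer(text)]
print(plain_matches)
print(guarded_matches)
```

Without the look-ahead, the letters of the outer tag names and the 4 are matched one by one; with it, only the two terms inside the parentheses remain.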

NevSoFly (Author) Commented:
thx once more.