Link to home
Start Free TrialLog in
Avatar of John Carney
John CarneyFlag for United States of America

asked on

RegEx in Excel that will match a string of numbers between x and y characters

I'm trying to learn more about RegEx and I'm starting by simply modifying a working code that I got from some of the experts here with a new pattern. Please take a look at my code and workbook and let me know what the correct code would be for this pattern.

Thanks,
John

========
Prior related question:
https://www.experts-exchange.com/questions/27244735/RegEx-in-Excel-that-will-match-a-string-of-numbers-between-x-and-y-characters.html
RegExWorkshop1.xls
Avatar of Patrick Matthews
Patrick Matthews
Flag of United States of America image

John,

Please try describing with sentences what you are trying to accomplish with the pattern.

Patrick
Avatar of John Carney

ASKER

Hi Patrick, thanks for responding so fast.

Basically, I want D5 (=checknum(C5)) to return something (as opposed to the error it's producing). In my revised code below I think I'm looking for "hello" or "goodbye", I'm not sure which.

Does that explain it well enough?

Thanks,
John






Function CheckNum(InputCell As Range) As String
    Dim RegEx, RegM, RegMC
    Dim MyDic
    Dim tmpStr As String
    Dim i As Long
    Set MyDic = CreateObject("scripting.dictionary")
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Pattern = "\b(?:\d[ -]*?){13,16}\b"
        '.Global = True
        '.IgnoreCase = True
        If .test(InputCell.Value) Then
            Set RegMC = .Execute(InputCell)
            For Each RegM In RegMC
                If MyDic.exists(LCase$(RegM)) = False Then
                    For i = RegM.submatches(1) To RegM.submatches(4)
                            tmpStr = "hello" 'InputCell.Offset(0, 3)
                    Next
                    tmpStr = "goodbye"
                    MyDic.Add LCase$(RegM), 1
                End If
            Next
        End If
                CheckNum = tmpStr
    End With
End Function

Open in new window

The regex is OK. It is the rest of the function.

What value do you want returned.

The regex is correctly matching the data, but you are trying to use submatches when you've not captured anything from the regex, just matched it.

Your regex can be read as ...

\b(?:\d[ -]*?){13,16}\b

Options: case insensitive; ^ and $ match at line breaks

Assert position at a word boundary «\b»
Match the regular expression below «(?:\d[ -]*?){13,16}»
   Between 13 and 16 times, as many times as possible, giving back as needed (greedy) «{13,16}»
   Match a single digit 0..9 «\d»
   Match a single character present in the list below «[ -]*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
      The character “ ” « »
      The character “-” «-»
Assert position at a word boundary «\b»


Created with RegexBuddy

Open in new window


Note that you aren't actually capturing anything, just matching.

So Matches and SubMatches would be empty/null/etc.

So, removing the submatches code ...
Function CheckNum(InputCell As Range) As String
    Dim RegEx, RegM, RegMC
    Dim MyDic
    Dim tmpStr As String
    Dim i As Long
    Set MyDic = CreateObject("scripting.dictionary")
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Pattern = "\b(?:\d[ -]*?){13,16}\b"
        '.Global = True
        '.IgnoreCase = True
        If .test(InputCell.Value) Then
            Set RegMC = .Execute(InputCell)
            For Each RegM In RegMC
                If MyDic.exists(LCase$(RegM)) = False Then
                    'For i = RegM.submatches(1) To RegM.submatches(4)
                    '        tmpStr = "hello" 'InputCell.Offset(0, 3)
                    'Next
                    tmpStr = "goodbye"
                    MyDic.Add LCase$(RegM), 1
                End If
            Next
        End If
        CheckNum = tmpStr
    End With
End Function

Open in new window


produces output of ...

red	
4456789214872	goodbye
4456-7892-1487-2230	goodbye
1234567891234567	goodbye
blue	

Open in new window


Avatar of Bill Prew
Bill Prew

Not sure I completely understand what you are trying to accomplish, but your problem seems to lie in this statement currently:

                    For i = RegM.submatches(1) To RegM.submatches(4)

You are assuming that there will always be 4 submatches, but I don't think that is the case for some of your test data.

~bp
ASKER CERTIFIED SOLUTION
Avatar of Richard Quadling
Richard Quadling
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@Billprew - as there is no capturing in the regex, there will be no submatches.
@RQuadling

I thought the matched pattern inside the parens would be considered a submatch, no?  (openly admitting that I am not a master of regex...)

~bp
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you for your answers. I'm working on a deadline right now but I'll get back to this within 2 hours I hope.  -John
I think I will be able to get a working solution out of your last two answers.

Thanks!

John