We help IT Professionals succeed at work.

RegEx in Excel that will match a string of numbers between x and y characters

John Carney
John Carney used Ask the Experts™
on
I'm trying to learn more about RegEx and I'm starting by simply modifying a working code that I got from some of the experts here with a new pattern. Please take a look at my code and workbook and let me know what the correct code would be for this pattern.

Thanks,
John

========
Prior related question:
http://www.experts-exchange.com/Q_27244735.html
RegExWorkshop1.xls
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2010

Commented:
John,

Please try describing with sentences what you are trying to accomplish with the pattern.

Patrick
John CarneyReliability Business Tools Analyst II

Author

Commented:
Hi Patrick, thanks for responding so fast.

Basically, I want D5 (=checknum(C5)) to return something (as opposed to the error it's producing). In my revised code below I think I'm looking for "hello" or "goodbye", I'm not sure which.

Does that explain it well enough?

Thanks,
John






Function CheckNum(InputCell As Range) As String
    Dim RegEx, RegM, RegMC
    Dim MyDic
    Dim tmpStr As String
    Dim i As Long
    Set MyDic = CreateObject("scripting.dictionary")
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Pattern = "\b(?:\d[ -]*?){13,16}\b"
        '.Global = True
        '.IgnoreCase = True
        If .test(InputCell.Value) Then
            Set RegMC = .Execute(InputCell)
            For Each RegM In RegMC
                If MyDic.exists(LCase$(RegM)) = False Then
                    For i = RegM.submatches(1) To RegM.submatches(4)
                            tmpStr = "hello" 'InputCell.Offset(0, 3)
                    Next
                    tmpStr = "goodbye"
                    MyDic.Add LCase$(RegM), 1
                End If
            Next
        End If
                CheckNum = tmpStr
    End With
End Function

Open in new window

Richard QuadlingSenior Software Developer

Commented:
The regex is OK. It is the rest of the function.

What value do you want returned.

The regex is correctly matching the data, but you are trying to use submatches when you've not captured anything from the regex, just matched it.

Your regex can be read as ...

\b(?:\d[ -]*?){13,16}\b

Options: case insensitive; ^ and $ match at line breaks

Assert position at a word boundary «\b»
Match the regular expression below «(?:\d[ -]*?){13,16}»
   Between 13 and 16 times, as many times as possible, giving back as needed (greedy) «{13,16}»
   Match a single digit 0..9 «\d»
   Match a single character present in the list below «[ -]*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
      The character “ ” « »
      The character “-” «-»
Assert position at a word boundary «\b»


Created with RegexBuddy

Open in new window


Note that you aren't actually capturing anything, just matching.

So Matches and SubMatches would be empty/null/etc.

Richard QuadlingSenior Software Developer

Commented:
So, removing the submatches code ...
Function CheckNum(InputCell As Range) As String
    Dim RegEx, RegM, RegMC
    Dim MyDic
    Dim tmpStr As String
    Dim i As Long
    Set MyDic = CreateObject("scripting.dictionary")
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Pattern = "\b(?:\d[ -]*?){13,16}\b"
        '.Global = True
        '.IgnoreCase = True
        If .test(InputCell.Value) Then
            Set RegMC = .Execute(InputCell)
            For Each RegM In RegMC
                If MyDic.exists(LCase$(RegM)) = False Then
                    'For i = RegM.submatches(1) To RegM.submatches(4)
                    '        tmpStr = "hello" 'InputCell.Offset(0, 3)
                    'Next
                    tmpStr = "goodbye"
                    MyDic.Add LCase$(RegM), 1
                End If
            Next
        End If
        CheckNum = tmpStr
    End With
End Function

Open in new window


produces output of ...

red	
4456789214872	goodbye
4456-7892-1487-2230	goodbye
1234567891234567	goodbye
blue	

Open in new window


Bill PrewTest your restores, not your backups...
Top Expert 2016

Commented:
Not sure I completely understand what you are trying to accomplish, but your problem seems to lie in this statement currently:

                    For i = RegM.submatches(1) To RegM.submatches(4)

You are assuming that there will always be 4 submatches, but I don't think that is the case for some of your test data.

~bp
Senior Software Developer
Commented:
I'm not sure how the dictionary will work as you are not making it persistent between calls. So it is pretty redundant.

So
Function CheckNum(InputCell As Range) As String
    Dim RegEx As Object
   
    Set RegEx = CreateObject("vbscript.regexp")
    
    With RegEx
        .Pattern = "\b(?:\d[ -]*?){13,16}\b"
        If .test(InputCell.Value) Then
            CheckNum = "Valid"
        Else
            CheckNum = "Invalid"
        End If
    End With

End Function

Open in new window


returns ...

red	Invalid
4456789214872	Valid
4456-7892-1487-2230	Valid
1234567891234567	Valid
blue	Invalid
red	Invalid
4456789214872	Valid
4456-7892-1487-2230	Valid
1234567891234567	Valid
blue	Invalid

Open in new window


I was trying to see how to do the dictionary, but I'm not sure.
Richard QuadlingSenior Software Developer

Commented:
@Billprew - as there is no capturing in the regex, there will be no submatches.
Bill PrewTest your restores, not your backups...
Top Expert 2016

Commented:
@RQuadling

I thought the matched pattern inside the parens would be considered a submatch, no?  (openly admitting that I am not a master of regex...)

~bp
Richard QuadlingSenior Software Developer
Commented:
@BillPrew - unless the rules are different for the scripting regex, then no.

There is the match (which will be the entire string) and then any captured elements (submatches).

So, for example.

`(?:\d{4}-\d{2}-\d{2})`

is a pattern to match a date, it won't capture any part other than the match.

But ...

`(\d{4}-\d{2}-\d{2})`

and

`(\d{4})-(\d{2})-(\d{2})`

will.



I'm also not a Scripting RegEx expert - I use PCRE in PHP mainly, but a tool called RegexBuddy is a VERY useful tool!
John CarneyReliability Business Tools Analyst II

Author

Commented:
Thank you for your answers. I'm working on a deadline right now but I'll get back to this within 2 hours I hope.  -John
John CarneyReliability Business Tools Analyst II

Author

Commented:
I think I will be able to get a working solution out of your last two answers.

Thanks!

John