Solved

regular expression to extract switch model and number of ports

Posted on 2011-09-23
10
351 Views
Last Modified: 2012-05-12
The following code extracts 2950 from "WS-C2950G-12-EI"  or "2950-24"   or "2950"

How can I modify it to extract #ports.  IE "2950-12"  "2950-24", "2950"
Sub func2()
    Dim reg As Object
    Dim test
    Dim matches As Object
    Dim m As Object
    Set reg = CreateObject("vbscript.regexp")
     
    For Each test In Array("WS-C2950G-12-EI", "2950-12", "2950")

        With reg
            .Pattern = "(?:\D|\b)(\d{3,4})(?=\D|\b)"
            .Global = True
            Set matches = .execute(test)
        End With
         
        For Each m In matches
            MsgBox m.Submatches(0)
        Next
    Next
End Su

Open in new window

0
Comment
Question by:rberke
  • 6
  • 4
10 Comments
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 36590113
rberke,

Can't tell if this is VBA or VB6, but this will work either way.

1) Add this function:

Function RegExpFindSubmatch(LookIn As String, PatternStr As String, Optional MatchPos, _
    Optional SubmatchPos, Optional MatchCase As Boolean = True, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' For more info, please see:
    ' http://www.experts-exchange.com/articles/Programming/Languages/Visual_Basic/Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return "submatches"
    ' from the various matches to a pattern (PatternStr).  In RegExp, submatches within a pattern
    ' are defined by grouping portions of the pattern within parentheses.
    
    ' Use MatchPos to indicate which match you want:
    ' MatchPos omitted               : function returns results for all matches
    ' MatchPos = 1                   : the first match
    ' MatchPos = 2                   : the second match
    ' MatchPos = <positive integer>  : the Nth match
    ' MatchPos = 0                   : the last match
    ' MatchPos = -1                  : the last match
    ' MatchPos = -2                  : the 2nd to last match
    ' MatchPos = <negative integer>  : the Nth to last match
    
    ' Use SubmatchPos to indicate which match you want:
    ' SubmatchPos omitted               : function returns results for all submatches
    ' SubmatchPos = 1                   : the first submatch
    ' SubmatchPos = 2                   : the second submatch
    ' SubmatchPos = <positive integer>  : the Nth submatch
    ' SubmatchPos = 0                   : the last submatch
    ' SubmatchPos = -1                  : the last submatch
    ' SubmatchPos = -2                  : the 2nd to last submatch
    ' SubmatchPos = <negative integer>  : the Nth to last submatch
    
    ' The return type for this function depends on whether your choice for MatchPos is looking for
    ' a single value or for potentially many.  All arrays returned by this function are zero-based.
    ' When the function returns a 2-D array, the first dimension is for the matches and the second
    ' dimension is for the submatches
    ' MatchPos omitted, SubmatchPos omitted: 2-D array of submatches for each match.  First dimension
    '                                        based on number of matches (0 to N-1), second dimension
    '                                        based on number of submatches (0 to N-1)
    ' MatchPos omitted, SubmatchPos used   : 2-D array (0 to N-1, 0 to 0) of the specified submatch
    '                                        from each match
    ' MatchPos used, SubmatchPos omitted   : 2-D array (0 to 0, 0 to N-1) of the submatches from the
    '                                        specified match
    ' MatchPos used, SubmatchPos used      : String with specified submatch from specified match
    
    ' For any submatch that is not found, the function treats the result as a zero-length string
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Mat As Object
    Dim Answer() As String
    Dim Counter As Long
    Dim SubCounter As Long
    
    ' Evaluate MatchPos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(MatchPos) Then
        If Not IsNumeric(MatchPos) Then
            RegExpFindSubmatch = ""
            Exit Function
        Else
            MatchPos = CLng(MatchPos)
        End If
    End If
    
    ' Evaluate SubmatchPos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(SubmatchPos) Then
        If Not IsNumeric(SubmatchPos) Then
            RegExpFindSubmatch = ""
            Exit Function
        Else
            SubmatchPos = CLng(SubmatchPos)
        End If
    End If
    
    ' Create instance of RegExp object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.Test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' If MatchPos is missing, user either wants array of all the submatches for each match, or an
        ' array of all the specified submatches for each match.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(MatchPos) Then
            
            ' Return value is a 2-D array of all the submatches for each match
            
            If IsMissing(SubmatchPos) Then
                For Counter = 0 To TheMatches.Count - 1
                    Set Mat = TheMatches(Counter)
                    
                    ' To determine how many submatches there are we need to first evaluate a match.  That
                    ' is why we redim the array inside the for/next loop
                    
                    If Counter = 0 Then
                        ReDim Answer(0 To TheMatches.Count - 1, 0 To Mat.Submatches.Count - 1) As String
                    End If
                    
                    ' Loop through the submatches and populate the array.  If the Nth submatch is not
                    ' found, RegExp returns a zero-length string
                    
                    For SubCounter = 0 To UBound(Answer, 2)
                        Answer(Counter, SubCounter) = Mat.Submatches(SubCounter)
                    Next
                Next
            
            ' Return value is a 2-D array of the specified submatch for each match.
            
            Else
                For Counter = 0 To TheMatches.Count - 1
                    Set Mat = TheMatches(Counter)
                    
                    ' To determine how many submatches there are we need to first evaluate a match.  That
                    ' is why we redim the array inside the for/next loop.  If SubmatchPos = 0, then we want
                    ' the last submatch.  In that case reset SubmatchPos so it equals the submatch count.
                    ' Negative number indicates Nth to last; convert that to applicable "positive" position
                    
                    If Counter = 0 Then
                        ReDim Answer(0 To TheMatches.Count - 1, 0 To 0) As String
                        Select Case SubmatchPos
                            Case Is > 0: 'no adjustment needed
                            Case 0, -1: SubmatchPos = Mat.Submatches.Count
                            Case Is < -Mat.Submatches.Count: SubmatchPos = -SubmatchPos
                            Case Else: SubmatchPos = Mat.Submatches.Count + SubmatchPos + 1
                        End Select
                    End If
                    
                    ' Populate array with the submatch value.  If the submatch value is not found, or if
                    ' SubmatchPos > the count of submatches, populate with a zero-length string
                    
                    If SubmatchPos <= Mat.Submatches.Count Then
                        Answer(Counter, 0) = Mat.Submatches(SubmatchPos - 1)
                    Else
                        Answer(Counter, 0) = ""
                    End If
                Next
            End If
            RegExpFindSubmatch = Answer
            
        ' User wanted the info associated with the Nth match (or last match, if MatchPos = 0)
        
        Else
            
            ' If MatchPos = 0 then make MatchPos equal the match count.  If negative (indicates Nth
            ' to last), convert to equivalent position.
            
            Select Case MatchPos
                Case Is > 0: 'no adjustment needed
                Case 0, -1: MatchPos = TheMatches.Count
                Case Is < -TheMatches.Count: MatchPos = -MatchPos
                Case Else: MatchPos = TheMatches.Count + MatchPos + 1
            End Select
            
            ' As long as MatchPos does not exceed the match count, process the Nth match.  If the
            ' match count is exceeded, return a zero-length string
            
            If MatchPos <= TheMatches.Count Then
                Set Mat = TheMatches(MatchPos - 1)
                
                ' User wants a 2-D array of all submatches for the specified match; populate array.  If
                ' a particular submatch is not found, RegExp treats it as a zero-length string
                
                If IsMissing(SubmatchPos) Then
                    ReDim Answer(0 To 0, 0 To Mat.Submatches.Count - 1)
                    For SubCounter = 0 To UBound(Answer, 2)
                        Answer(0, SubCounter) = Mat.Submatches(SubCounter)
                    Next
                    RegExpFindSubmatch = Answer
                
                ' User wants a single value
                
                Else
                    
                    ' If SubmatchPos = 0 then make it equal count of submatches.  If negative, this
                    ' indicates Nth to last; convert to equivalent positive position
                    
                    Select Case SubmatchPos
                        Case Is > 0: 'no adjustment needed
                        Case 0, -1: SubmatchPos = Mat.Submatches.Count
                        Case Is < -Mat.Submatches.Count: SubmatchPos = -SubmatchPos
                        Case Else: SubmatchPos = Mat.Submatches.Count + SubmatchPos + 1
                    End Select
                    
                    ' If SubmatchPos <= count of submatches, then get that submatch for the specified
                    ' match.  If the submatch value is not found, or if SubmathPos exceeds count of
                    ' submatches, return a zero-length string.  In testing, it appeared necessary to
                    ' use CStr to coerce the return to be a zero-length string instead of zero
                    
                    If SubmatchPos <= Mat.Submatches.Count Then
                        RegExpFindSubmatch = CStr(Mat.Submatches(SubmatchPos - 1))
                    Else
                        RegExpFindSubmatch = ""
                    End If
                End If
            Else
                RegExpFindSubmatch = ""
            End If
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFindSubmatch = ""
    End If
    
Cleanup:
    ' Release object variables
    Set Mat = Nothing
    Set TheMatches = Nothing
    
End Function

Open in new window



2) If this is going into Excel or Access, use an expression like this:

=RegExpFindSubmatch(A1,"(\d{3,4})([A-Z])?(-\d+)?",1,1) & RegExpFindSubmatch(A1,"(\d{3,4})([A-Z])?(-\d+)?",1,3)

3) If this is being used in VBA/VB6, use as you would any other function.

For more about this approach, please see:

http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1336-Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html

Patrick
0
 
LVL 92

Accepted Solution

by:
Patrick Matthews earned 500 total points
ID: 36590279
Another approach, using another one of the functions from my article http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1336-Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html ...

1) Add this function:

Function RegExpReplace(LookIn As String, PatternStr As String, Optional ReplaceWith As String = "", _
    Optional ReplaceAll As Boolean = True, Optional MatchCase As Boolean = True, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' For more info, please see:
    ' http://www.experts-exchange.com/articles/Programming/Languages/Visual_Basic/Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string, and replace parts of the string
    ' matching the specified pattern with another string.  The optional argument ReplaceAll
    ' controls whether all instances of the matched string are replaced (True) or just the first
    ' instance (False)
    
    ' If you need to replace the Nth match, or a range of matches, then use RegExpReplaceRange
    ' instead
    
    ' By default, RegExp is case-sensitive in pattern-matching.  To keep this, omit MatchCase or
    ' set it to True
    
    ' If you use this function from Excel, you may substitute range references for all the arguments
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = ReplaceAll
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
    
    RegExpReplace = RegX.Replace(LookIn, ReplaceWith)
    
End Function

Open in new window


2) Use an expression like this:

=RegExpReplace(A1,"(^\D*)((\d{3,4})([A-Z])?(-\d+)?)?(.*$)","$3$5")
0
 
LVL 5

Author Comment

by:rberke
ID: 36590637
Thanks.

Maybe you can tell me why this works:

MsgBox RegExpReplace("WS-C2950Gxxxxx-12-EI", "(^\D*)((\d{3,4})([A-Za-z])*(-\d+)?)?(.*$)", "$3$5")

But this does not:

MsgBox RegExpReplace("WS-C2950Gxxxxx-12-EI", "(^\D*)((\d{3,4})([^0-9])*(-\d+)?)?(.*$)", "$3$5")
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 36590775
I think that second one fails because the * is being "greedy".

That is, your element ([^0-9])* is going to grab as many non-digits as it can, including the - in the -12.  As a result, nothing is found for the submatch (-\d+)?

Some Regular Expressions dialects have a "non-greedy" version of *, but VBScript does not.
0
 
LVL 5

Author Closing Comment

by:rberke
ID: 36590901
I just need to tell expression 4 to not grab minus signs:  ^0-9-   <====see the extra minus which was omitted before.

MsgBox RegExpReplace("WS-C2950Gxxxxx-12-EI", "(^\D*)((\d{3,4})([^0-9-])*(-\d+)?)?(.*$)", "$3$5")
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 36591353
rberke,

Glad to help!  If you have not already done so, I would really appreciate it if you could please return to my article and click 'Yes' for the 'Was this helpful?' voting.

Patrick
0
 
LVL 5

Author Comment

by:rberke
ID: 36593559
It turns out some of my cells have switch names in them. For instance cell a1 has "WS-C2950G-12-EI WS-C1234G-12-EI".

I want b1 to show 2950-12 and c1 to show 1234-12.  

I am solving the problem by putting the following in cell b1 and c1 where subpattern=(?:^\D*)(?:(\d{3,4})(?:[^0-9-])*(-\d+)?)?(?:.*$)

b1 formula =RegExpFindSubmatch(a1,SubPattern,1,1) &RegExpFindSubmatch(a1,SubPattern,1,2)

c1 formula =RegExpFindSubmatch(a1,SubPattern,2,1) &RegExpFindSubmatch(a1,SubPattern,2,2)

Is there anyway to do this as an array formula without me writing a UDF?  Of course I could write it easily, but I think there might be some sort of trick like tacking on a $1$2 parameter somewhere.
0
 
LVL 5

Author Comment

by:rberke
ID: 36593563
typo.  should say some of my cells have  SEVERAL switch names in them
0
 
LVL 5

Author Comment

by:rberke
ID: 36593640
minor issue.  subpattern should only allow a two digit number of ports so changed \d+ to \d{2}

(?:^\D*)(?:(\d{3,4})(?:[^0-9-])*(-\d{2})?)?(?:.*$)

0
 
LVL 5

Author Comment

by:rberke
ID: 36594191
I don't want to muck up you nice article with a bunch of junk, so I am posting some code here.

Since you seemed interested in my idea of the psrdigit routines, I took a shot at converting them to your routine.

I am running out of time, so I am posting snippets that don't quite work together.  Perhaps they will be of some value.  Otherwise, just ignore them

first snippet would be in command interpeter
    ' psremail     extract multiple email address
    ' psrdigit 5   extract multiple 5 digit numbers
    ' psrdigit 2 5 look for between 2 and 5 digits
    ' psrmmddyy    reformat dates from 3/2/11  to 03/02/11


 Case "psremail": Call dbg: Call PasteRegExpToSelection("([\w.%+-]+@[\w.%+-]+\.+[A-Za-z0-9]{2,4})+", param1, True)
    Case "psrdigit", "psr":
            If param1 = "" Then param1 = 6
            Call dbg: Call PasteRegExpToSelection("(?:\D|\b)(\d{99,99})(?=\D|\b)", param1, True)
     Case "psrip":
            If param1 = "" Then param1 = 6
            Call dbg: Call PasteRegExpToSelection("(?:\D|\b)(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(\D|\b)", param1, True)
    Case "psrmmddyy":
            Call dbg: Call PasteRegExpToSelection("(?:\D|\b)(\d{1,2}[-./ _]\d{1,2}[-./ _]\d{1,2})(\D|\b)", param1, True)
' and this code is nearly done

Sub PasteRegExpToSelection(regexp As String, NumberOfDigits As String, Submatches As Boolean)
        Dim rgxFormula As String, Dummy
        Dim tgtCell As Range
        Set tgtCell = Selection.Cells(1)
        Dim ItemOffset As String
        Dim ItemBase As String
        Dim CommasForSubmatch As String

        Dim Commas As String

  
        ItemBase = tgtCell.Offset(0, -1).Address(False, True)
        ItemOffset = "column(" & tgtCell.Address(False, False) & ")-column(" & tgtCell.Offset(0, -1).Address(False, True) & ")"
        If Submatches Then
            rgxFormula = Replace("=RegExpFindSubmatch(" & ItemBase & ",'" & regexp & "'," & ItemOffset & ")", "'", Chr(34))
        Else
            rgxFormula = Replace("=RegExpFind(" & ItemBase & ",'" & regexp & "'," & ItemOffset & ")", "'", Chr(34))
        End If
            
        If IsNumeric(NumberOfDigits) Then
            NumberOfDigits = "{" & NumberOfDigits & "}"
        End If
        rgxFormula = Replace(rgxFormula, "{99,99}", NumberOfDigits)
        If Selection.Cells.count + Selection.Areas.count = 2 And Len(Selection.Cells(1).formula) = 0 Then
            Selection = rgxFormula
        Else
            Dummy = InputBox("", , rgxFormula)
        End If
End Sub

Open in new window

0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

by Batuhan Cetin Regular expression is a language that we use to edit a string or retrieve sub-strings that meets specific rules from a text. A regular expression can be applied to a set of string variables. There are many RegEx engines for u…
Do you hate spam? I do, and I am willing to bet you do as well. I often wonder, though, "if people hate spam so much, why do they still post their email addresses on the web?" I'm not talking about a plain-text posting here. I am referring to the fa…
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now