Solved

String Segment Extract

Posted on 2009-05-19
2
707 Views
Last Modified: 2013-12-20
Looking for string manipulation code to look for and pull out 1 or more occurances of a piece of text that always ends with 2 asterisks and is preceeded by a space, like so:

This table contains a description column containing text that I wish to extract** and place in other columns in my table.  This is a way to extract keywords from articles** and provide a table for keyword lookups.  

So the returned values from the text above would be "extract" and "articles".
0
Comment
Question by:plord1234
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
2 Comments
 
LVL 92

Accepted Solution

by:
Patrick Matthews earned 500 total points
ID: 24424443
The UDF below can produce an array of the matches.  Use it like this:

Dim arr As Variant
Dim Counter As Long

arr = RegExpFind(SomeString, " [a-z]+\*\*", , False)

If IsArray(arr) Then
    For Counter = LBound(arr) To UBound(arr)
        Debug.Print Mid(arr(Counter), 2, Len(arr(Counter)) - 3)
    Next
Else
    'code for no matches found
End If
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True) 
    ' For more info see: http://vbaexpress.com/kb/getarticle.php?kb_id=841 
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 0                   : the last match
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' If Pos is greater than the number of matches, is negative, or is non-numeric, the function
    ' returns an empty string.  If no match is found, the function returns an empty string
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer() As String
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Create instance of RegExp object
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
    End With
        
    ' Test to see if there are any matches
    If RegX.test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        Set TheMatches = RegX.Execute(LookIn)
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1) As String
            For Counter = 0 To UBound(Answer)
                Answer(Counter) = TheMatches(Counter)
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        Else
            Select Case Pos
                Case 0                          ' Last match
                    RegExpFind = TheMatches(TheMatches.Count - 1)
                Case 1 To TheMatches.Count      ' Nth match
                    RegExpFind = TheMatches(Pos - 1)
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    Else
        RegExpFind = ""
    End If
    
    ' Release object variables
    Set TheMatches = Nothing
    
End Function

Open in new window

0
 

Author Closing Comment

by:plord1234
ID: 31583129
This was perfect!  Thanks Matthew.
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

As tax season makes its return, so does the increase in cyber crime and tax refund phishing that comes with it
Preparing an email is something we should all take special care with – especially when the email is for somebody you may not know very well. The pressures of everyday working life stacked with a hectic office environment can make this a real challen…
Using Microsoft Access, learn some simple rules for how to construct tables in a relational database. Split up all multi-value fields into single values: Split up fields that belong to other things into separate tables: Make sure that all record…
In Microsoft Access, learn the trick to repeating sub-report headings at the top of each page. The problem with sub-reports and headings: Add a dummy group to the sub report using the expression =1: Set the “Repeat Section” property of the dummy…

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question