Left Excel Function to extract text

Posted on 2011-10-11
Last Modified: 2012-05-12
Dear Excel Gurus,
I need a function to extract text.  Please see my screen shot and attached example.

Question by:BajanPaul
    LVL 26

    Expert Comment


    What are the rules for the Part No. (or for the part of Product that isn't the Part No.)?

    LVL 50

    Expert Comment

    by:barry houdini
    Hello BajanPaul,

    Assuming all part numbers have at least 6 characters try this formula in A2

    =LEFT(B2,FIND(" ",B2,6))

    Assuming "Product" in B2

    regards, barry
    LVL 8

    Expert Comment

    =LEFT($B:$B;FIND(" ";$B:$B)-1)
    =LEFT($B:$B;FIND("  ";$B:$B)-1)

    since you seem to have one space in part numbers such as line 12.
    LVL 26

    Accepted Solution

    What about...

    =IFERROR(LEFT(B3,FIND("  ",B3,1)-1),LEFT(B3,FIND(" ",B3,1)-1))

    So, it looks for two spaces and if it doesn't find that it uses one.
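
For anyone who prefers a UDF to nested worksheet functions, the same fallback can be sketched in VBA. This is only an illustration: the name PartNoBeforeGap is hypothetical, and the sketch differs from the formula in one way. The formula returns #VALUE! when the cell contains no space at all, whereas this version returns the whole text.

```vba
' Hypothetical helper mirroring the IFERROR formula above:
' cut at the first run of two spaces, otherwise at the first single space.
Function PartNoBeforeGap(ByVal Product As String) As String
    Dim p As Long
    p = InStr(1, Product, "  ")                 ' look for two consecutive spaces first
    If p = 0 Then p = InStr(1, Product, " ")    ' fall back to a single space
    If p = 0 Then
        PartNoBeforeGap = Product               ' assumption: no space means the whole text is the part number
    Else
        PartNoBeforeGap = Left$(Product, p - 1) ' everything before the separator
    End If
End Function
```

Placed in a regular module, it would be called from the sheet as =PartNoBeforeGap(B3).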

    LVL 50

    Expert Comment

    by:barry houdini
    Sorry, I should have subtracted 1 otherwise you get a trailing space....

    =LEFT(B2,FIND(" ",B2,6)-1)

    LVL 92

    Expert Comment

    by:Patrick Matthews
    In looking at the file...

    Most rows seem to be breaking at the point where there are 2+ consecutive spaces.  However, Row 8 shows a break at a single space.

    The rule does not appear to be "break at the first space", because Rows 3 & 4 violate that.

    The following Regular Expressions-powered formula is my best guess at it:

    =RegExpFind(B3,"^[^ ]+( [^ ]+)*(?=( {2,}| [^ ]+$))",1)

    To use that, you'll need to add this function to a regular VBA module:

    Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
        Optional MatchCase As Boolean = True, Optional ReturnType As Long = 0, _
        Optional MultiLine As Boolean = False)
        ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
        ' as long as you properly credit and attribute authorship and the URL of where you
        ' found the code
        ' For more info, please see:
        ' This function relies on the VBScript version of Regular Expressions, and thus some of
        ' the functionality available in Perl and/or .Net may not be available.  The full extent
        ' of what functionality will be available on any given computer is based on which version
        ' of the VBScript runtime is installed on that computer
        ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
        ' pattern (PatternStr).  Use Pos to indicate which match you want:
        ' Pos omitted               : function returns a zero-based array of all matches
        ' Pos = 1                   : the first match
        ' Pos = 2                   : the second match
        ' Pos = <positive integer>  : the Nth match
        ' Pos = 0                   : the last match
        ' Pos = -1                  : the last match
        ' Pos = -2                  : the 2nd to last match
        ' Pos = <negative integer>  : the Nth to last match
        ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
        ' matches, the function returns an empty string.  If no match is found, the function returns
        ' an empty string.  (Earlier versions of this code used zero for the last match; this is
        ' retained for backward compatibility)
        ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
        ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
        ' ReturnType indicates what information you want to return:
        ' ReturnType = 0            : the matched values
        ' ReturnType = 1            : the starting character positions for the matched values
        ' ReturnType = 2            : the lengths of the matched values
        ' If you use this function in Excel, you can use range references for any of the arguments.
        ' If you use this in Excel and return the full array, make sure to set up the formula as an
        ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
        ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
        ' at zero.  Since VB6 and VBA have strings starting at position 1, I have added one to make
        ' the character positions conform to VBA/VB6 expectations
        ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
        ' where a large number of calls to this function are made, making RegX a static variable that
        ' preserves its state in between calls significantly improves performance
        Static RegX As Object
        Dim TheMatches As Object
        Dim Answer()
        Dim Counter As Long
        ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
        If Not IsMissing(Pos) Then
            If Not IsNumeric(Pos) Then
                RegExpFind = ""
                Exit Function
            Else
                Pos = CLng(Pos)
            End If
        End If
        ' Evaluate ReturnType
        If ReturnType < 0 Or ReturnType > 2 Then
            RegExpFind = ""
            Exit Function
        End If
        ' Create instance of RegExp object if needed, and set properties
        If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
        With RegX
            .Pattern = PatternStr
            .Global = True
            .IgnoreCase = Not MatchCase
            .MultiLine = MultiLine
        End With
        ' Test to see if there are any matches
        If RegX.Test(LookIn) Then
            ' Run RegExp to get the matches, which are returned as a zero-based collection
            Set TheMatches = RegX.Execute(LookIn)
            ' Test to see if Pos is negative, which indicates the user wants the Nth to last
            ' match.  If it is, then based on the number of matches convert Pos to a positive
            ' number, or zero for the last match
            If Not IsMissing(Pos) Then
                If Pos < 0 Then
                    If Pos = -1 Then
                        Pos = 0
                    Else
                        ' If Abs(Pos) > number of matches, then the Nth to last match does not
                        ' exist.  Return a zero-length string
                        If Abs(Pos) <= TheMatches.Count Then
                            Pos = TheMatches.Count + Pos + 1
                        Else
                            RegExpFind = ""
                            GoTo Cleanup
                        End If
                    End If
                End If
            End If
            ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
            ' function's return value
            If IsMissing(Pos) Then
                ReDim Answer(0 To TheMatches.Count - 1)
                For Counter = 0 To UBound(Answer)
                    Select Case ReturnType
                        Case 0: Answer(Counter) = TheMatches(Counter)
                        Case 1: Answer(Counter) = TheMatches(Counter).FirstIndex + 1
                        Case 2: Answer(Counter) = TheMatches(Counter).Length
                    End Select
                Next
                RegExpFind = Answer
            ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
            Else
                Select Case Pos
                    Case 0                          ' Last match
                        Select Case ReturnType
                            Case 0: RegExpFind = TheMatches(TheMatches.Count - 1)
                            Case 1: RegExpFind = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                            Case 2: RegExpFind = TheMatches(TheMatches.Count - 1).Length
                        End Select
                    Case 1 To TheMatches.Count      ' Nth match
                        Select Case ReturnType
                            Case 0: RegExpFind = TheMatches(Pos - 1)
                            Case 1: RegExpFind = TheMatches(Pos - 1).FirstIndex + 1
                            Case 2: RegExpFind = TheMatches(Pos - 1).Length
                        End Select
                    Case Else                       ' Invalid item number
                        RegExpFind = ""
                End Select
            End If
        ' If there are no matches, return empty string
        Else
            RegExpFind = ""
        End If
    Cleanup:
        ' Release object variables
        Set TheMatches = Nothing
    End Function


    For more about that, please see my article

    The attached file demonstrates my approach.
    LVL 92

    Expert Comment

    by:Patrick Matthews
    Translating the pattern string "^[^ ]+( [^ ]+)*(?=( {2,}| [^ ]+$))" ...

    ^ is the beginning of input

    [^ ]+ is one or more non-spaces

    ( [^ ]+)* is 0 or more instances of a single space followed by one or more non-spaces

    (?=( {2,}| [^ ]+$)) is a positive lookahead.  It's saying that the above expression must be followed by either 2 or more spaces, or by a single space followed by one or more non-spaces followed by the end of input.  In a positive lookahead, the pattern must match, but that portion is not included in the returned match.
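
To see the pattern behave as described without the full RegExpFind function, a stripped-down check like the one below may help. It is a sketch only: FirstRegexMatch and DemoPartNoPattern are hypothetical names, and the two sample strings are made up rather than taken from the attached workbook.

```vba
' Returns the first match of PatternStr in Text, or "" if there is none.
Function FirstRegexMatch(ByVal Text As String, ByVal PatternStr As String) As String
    Dim rx As Object, ms As Object
    Set rx = CreateObject("VBScript.RegExp")    ' late bound, so no library reference is needed
    rx.Pattern = PatternStr
    rx.Global = False                           ' only the first match is wanted
    Set ms = rx.Execute(Text)
    If ms.Count > 0 Then FirstRegexMatch = ms(0).Value
End Function

Sub DemoPartNoPattern()
    Const PAT As String = "^[^ ]+( [^ ]+)*(?=( {2,}| [^ ]+$))"
    ' Made-up sample: the lookahead is satisfied by the double space
    Debug.Print FirstRegexMatch("AB 1234  Large widget", PAT)   ' prints AB 1234
    ' Made-up sample: single space followed by one last word and end of input
    Debug.Print FirstRegexMatch("XY-100 Bolt", PAT)             ' prints XY-100
End Sub
```

In the second sample the greedy ( [^ ]+)* group first swallows " Bolt", fails the lookahead at the end of input, and backtracks to zero repetitions, which is why only the first word is returned.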

    LVL 50

    Expert Comment

    by:barry houdini
    Sorry, I still got my suggestion wrong, third time lucky - assuming part numbers have at least 6 characters

    =LEFT(B2,FIND(" ",B2,7)-1)

    I believe that works for all your examples: it takes everything that comes before the first space after character 6. That might not completely suit your requirements.
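
As a rough VBA equivalent of that formula, the sketch below starts the search one character past an assumed minimum length. The function name PartNoAfterMinLen and the MinLen parameter are hypothetical, and returning the whole text when no space is found is an assumption (the worksheet formula would give #VALUE! instead).

```vba
' Hypothetical UDF: take everything before the first space found
' at or after position MinLen + 1 (position 7 for the default of 6).
Function PartNoAfterMinLen(ByVal Product As String, _
                           Optional ByVal MinLen As Long = 6) As String
    Dim p As Long
    p = InStr(MinLen + 1, Product, " ")     ' same idea as FIND(" ",B2,7) when MinLen = 6
    If p = 0 Then
        PartNoAfterMinLen = Product         ' assumption: no later space, keep the whole text
    Else
        PartNoAfterMinLen = Left$(Product, p - 1)
    End If
End Function
```

With the made-up value "AB 1234  Large widget", the space at position 3 is skipped because the search starts at position 7, so the result is "AB 1234".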


    Author Closing Comment

    Worked Perfectly.

    LVL 26

    Expert Comment

    Thanks, BajanPaul!
