Complex text extraction

Below is a data in cell a2.  in cell K2 in need to extract the city, in L2, i need the State, and M2 the zip code.  These values will change, so the only permamanet text is "Known as:" and "Borrower"  so the city will be between both commas all the time, then the state then zip code.  

Situated in the City of xxxx, County of xxx and State of xx, and known as being all of Lot 15 in the blah blah foundation, as recorded in Plat Book xx, Pages xx, xx, found County Records of Plats, be the same more or less, but subject to all legal highways. Permanent Parcel No. xx-xxxxx1 Known as:  155 Goodhue Dr., Akron, Ohio 44313 Borrower  understands and agrees that xxxx, LLC holds only legal title to the interests granted by Borrower in this mortgage, but, if necessary to comply with law or custom, xxxx, LLC (as
sandramacAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Mark EdwardsChief Technology OfficerCommented:
Below what?  What are you talking about?  Do you have a single cell with data that you need to parse and put into other cells?  What are you wanting to do???????????  This doesn't make any sense as a problem or a question.
Subodh Tiwari (Neeraj)Excel & VBA ExpertCommented:
Place the following User Defined Functions on a Standard Module like Module1.
Function getCity(str As String) As String
With CreateObject("VBScript.RegExp")
    .Global = False
    .Pattern = ""City of ([A-Za-z]+),""
    If .test(str) Then
        getCity = .Execute(str)(0).submatches(0)
    End If
End With
End Function

Function getState(str As String) As String
With CreateObject("VBScript.RegExp")
    .Global = False
    .Pattern = "([A-Za-z]+)\s\d{5}\sBorrower"
    If .test(str) Then
        getState = .Execute(str)(0).submatches(0)
    End If
End With
End Function

Function getZip(str As String) As String
With CreateObject("VBScript.RegExp")
    .Global = False
    .Pattern = "[A-Za-z]+\s(\d{5})\sBorrower"
    If .test(str) Then
        getZip = .Execute(str)(0).submatches(0)
    End If
End With
End Function

Open in new window


And then place the following formulas on the Sheet.

In K2
=getCity(A2)

Open in new window


in L2
=getState(A2)

Open in new window


in M2
=getZip(A2)

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
sandramacAuthor Commented:
In cell A1 this is what is in the cell:  

Situated in the City of xxxx, County of xxx and State of xx, and known as being all of Lot 15 in the blah blah foundation, as recorded in Plat Book xx, Pages xx, xx, found County Records of Plats, be the same more or less, but subject to all legal highways. Permanent Parcel No. xx-xxxxx1 Known as:  155 Goodhue Dr., Akron, Ohio 44313 Borrower  understands and agrees that xxxx, LLC holds only legal title to the interests granted by Borrower in this mortgage, but, if necessary to comply with law or custom, xxxx, LLC (as

I need to extract Akron in one cell, Ohio in another cell, and 44313 in another cell.
Your Guide to Achieving IT Business Success

The IT Service Excellence Tool Kit has best practices to keep your clients happy and business booming. Inside, you’ll find everything you need to increase client satisfaction and retention, become more competitive, and increase your overall success.

Subodh Tiwari (Neeraj)Excel & VBA ExpertCommented:
In that case, replace the first function getCity with the following one...

Function getCity(str As String) As String
With CreateObject("VBScript.RegExp")
    .Global = False
    .Pattern = "([A-Za-z]+),\s[A-Za-z]+\s\d{5}\sBorrower"
    If .test(str) Then
        getCity = .Execute(str)(0).submatches(0)
    End If
End With
End Function

Open in new window

ExtractText.xlsm
Mark EdwardsChief Technology OfficerCommented:
ahhhhh…. simple text parsing.  got it.
You just need to find the position in the string of "Known as:", then find the 1st comma, that will be the beginning of the city.  Then find the comma after city and the word "Borrower" and that will be state and zipcode in between.  If the zipcode is always 5 characters, then everything tp the left of the last 5 characters will be the state.

I'll make a quick subroutine that will do that in my next post.
Mark EdwardsChief Technology OfficerCommented:
OK, It's simple and it works, but it's not fancy.  The subroutine below gave me the city, state, and zip from your "blob" of text.
Private Sub Command3_Click()

    Dim strBlob As String
    Dim strCity As String
    Dim strState As String
    Dim strZip As String
    Dim x As Integer
    Dim y As Integer
    Dim strWorking As String

    'LETS GET THE BLOB OF TEXT WE NEED TO PARSE:
    strBlob = "Situated in the City of xxxx, County of xxx and State of xx, and known as being all of Lot 15 in the blah blah foundation, as recorded in Plat Book xx, Pages xx, xx, found County Records of Plats, be the same more or less, but subject to all legal highways. Permanent Parcel No. xx-xxxxx1 Known as:  155 Goodhue Dr., Akron, Ohio 44313 Borrower  understands and agrees that xxxx, LLC holds only legal title to the interests granted by Borrower in this mortgage, but, if necessary to comply with law or custom, xxxx, LLC (as"
    'LETS GET THE POSITION OF "Known as:"
    x = InStr(1, strBlob, "Known as:")
    'NOW LETS GET THE POSITION OF THEEND OF "Known as:"
    x = x + Len("Known as:") + 1
    'LETS GET RID OF EVERYTHING TO THE LEFT OF X:
    strWorking = Trim(Mid(strBlob, x))
    'NOW LETS FIND THE 1ST COMMA AND GET RID OF THE ADDRESS:
    strWorking = Trim(Mid(strWorking, InStr(1, strWorking, ",") + 1))
    'NOW LETS FIND THE POSITION OF BORROWER:
    y = InStr(1, strWorking, "Borrower")
    'LETS GET RID OF "BORROWER" AND EVERYTHING AFTER THAT:
    strWorking = Trim(Left(strWorking, y - 1))
    'WHAT WE HAVE LEFT IS THE STRING CONTAINING THE CITY, STATE, AND ZIP:
    'NOW LETS PARSE THE PIECES:
    'LETS START WITH THE CITY by getting the position of the first comma:
    x = InStr(1, strWorking, ",")
    'parse the city:
    strCity = Trim(Left(strWorking, x - 1))
    'now lets get rid of the 1st comma and everything to the left so all we have left is state and zipcode:
    strWorking = Trim(Mid(strWorking, x + 1))
    'lets get the 5 zipcode characters:
    strZip = Trim(Right(strWorking, 5))
    'what's left is state:
    strWorking = Trim(Left(strWorking, Len(strWorking) - 6))
    strState = Trim(strWorking)
    
    MsgBox "City = " & strCity & vbCrLf & "State = " & strState & vbCrLf & "Zipcode = " & strZip
    
End Sub

Open in new window

sandramacAuthor Commented:
Thank you Mark, I like that also
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Law

From novice to tech pro — start learning today.