Text Parsing

Posted on 2011-10-31
Last Modified: 2012-05-12
I have a flat data file that looks like this:

Domain\Aaron.Hack.1 (Aaron Hack)
phone 5154934589
Domain\Frank.Smith (Frank Smith)

Domain\Judy.Smith.7 (Judy Smith)
email Judy.Smith@tony.vom
phone 2605551234
phone 2605551243

I want to produce:  Name,Email1,Email2,phone,phone in a .csv or Access table.  I used to utilize Monarch, but can't load it any longer (not approved by corporate).  Is there a similar tool on the shareware market?
Question by:TIgerV
    LVL 4

    Expert Comment

    by:Ariful Ambia
    Try ms excel data import option.

    Author Comment

    Since there is not a significant delimiter or number of rows, excel data import is not a good solution for this file.  
    LVL 44

    Expert Comment



    Bring the text into a Word document.


    Issue the following three Find Replace All commands:

    Open in new window


    Tweak the result for the first and last record.* Insert a quote character at the start of the very first line
    * change the final "," sequence to a single quote character

    You can save this as CSV or copy/paste it into Notepad for saving as CSV.  If you want to copy/paste it directly into Access, you would use a tab instead of a comma.

    If you make this text into a Word table, you might also be able to directly import it into Access or Excel (treat it like a proper data source).
    LVL 6

    Expert Comment

    You could use vb or try the following.

    Here are some products that look similar:
    LVL 44

    Expert Comment

    Yes.  It is possible to do that conversion with VB/VBScript code, using the Replace() function.
    LVL 92

    Accepted Solution

    This was actually very easy to do using classes.

    1) Add this code to a class module, and rename the class module clsPeople:

    ' Created by Patrick Matthews, Verint Systems
    ' Created 2011-10-31
    ' Parent collection class of clsPerson
    Option Explicit
    Option Compare Text
    ' Container for all clsPerson objects in the parent collection class
    Private coll As Collection
    Private Sub Class_Initialize()
        Set coll = New Collection
    End Sub
    Private Sub Class_Terminate()
        Set coll = Nothing
    End Sub
    Public Function Add(Domain As String) As clsPerson
        ' Adds a new item to the collection.  Causes an error if an item with the same key already exists
        ' or if you pass a zero length string for the Domain argument
        If Domain = "" Then
            Err.Raise vbObjectError + 1002, , "Domain property of clsPerson object cannot be zero length string"
        End If
        Set Add = New clsPerson
        Add.Domain = Domain
        ' Raise an error and set return value to Nothing if we fail to add item to collection (most likely
        ' because an item already exists with the same key
        On Error GoTo ErrHandler
        coll.Add Add, Domain
        Exit Function
        Set Add = Nothing
        Err.Raise vbObjectError + 1003, , "Could not add item '" & Domain & "' to clsPeople collection"
    End Function
    Public Sub Clear()
        ' Recreates (and thus clears) collection
        Set coll = New Collection
    End Sub
    Property Get Count() As Long
        ' Returns number of items in the collection
        ' Read-only
        Count = coll.Count
    End Property
    Function Exists(Domain As String) As Boolean
        ' Returns True if a clsPerson member specified by the Domain exists in clsPeople parent collection
        Dim TempItem As clsPerson
        On Error GoTo CleanUp
        ' Default return is False
        Exists = False
        ' If item exists, then the Set operation completes without error
        Set TempItem = coll(Domain)
        Exists = True
        Set TempItem = Nothing
    End Function
    Function Export()
        Dim Person As clsPerson
        Dim DestR As Long
        [a1:e1] = Array("Name", "Email1", "Email2", "Phone1", "Phone2")
        [d:e].NumberFormat = "@"
        DestR = 1
        For Each Person In Me
            DestR = DestR + 1
            With Person
                Cells(DestR, 1) = .Name
                Cells(DestR, 2) = .Email1
                Cells(DestR, 3) = .Email2
                Cells(DestR, 4) = .Phone1
                Cells(DestR, 5) = .Phone2
            End With
    End Function
    Function Import() As Boolean
        Dim LastR As Long
        Dim arr As Variant
        Dim Counter As Long
        Dim Person As clsPerson
        Dim TestStr As String
        Import = False
        On Error GoTo ErrHandler
        With ActiveSheet
            LastR = .Cells(.Rows.Count, "a").End(xlUp).Row
            arr = .Range("a1:a" & LastR).Value
        End With
        For Counter = 1 To UBound(arr, 1)
            TestStr = arr(Counter, 1)
            If TestStr Like "domain*" Then
                Set Person = Me.Add(TestStr)
                Person.Name = Mid(Person.Domain, InStr(1, Person.Domain, "(") + 1)
                Person.Name = Left(Person.Name, Len(Person.Name) - 1)
            ElseIf TestStr Like "email*" Then
                If Person.Email1 = "" Then
                    Person.Email1 = Mid(TestStr, InStr(1, TestStr, " ") + 1)
                    Person.Email2 = Mid(TestStr, InStr(1, TestStr, " ") + 1)
                End If
            ElseIf TestStr Like "phone*" Then
                If Person.Phone1 = "" Then
                    Person.Phone1 = Mid(TestStr, InStr(1, TestStr, " ") + 1)
                    Person.Phone2 = Mid(TestStr, InStr(1, TestStr, " ") + 1)
                End If
            End If
        Import = True
        Set Person = Nothing
    End Function
    Property Get Item(Index As Variant) As clsPerson
        ' Default property.  Returns an item from the collection.  Index may be either ordinal position (Long) or Domain (String)
        ' Read-only
        On Error GoTo ErrHandler
        Set Item = coll(Index)
        Exit Property
        Set Item = Nothing
        Err.Raise vbObjectError + 1004, , "Item does not exist in clsPeople collection"
    End Property
    Function Keys() As Variant
        ' Returns a 1-based array of the various strings used as Domain key values for the clsPerson items
        ' in the clsPeople collection"
        Dim Counter As Long
        Dim Results() As String
        ' If there are no items in the clsPeople collection then raise an error
        If Me.Count > 0 Then
            ' Redimension array so there is one member per clsPerson item in clsPeople collection
            ReDim Results(1 To Me.Count) As String
            ' Loop through clsPeople collection and grab Domain values for each clsPerson item
            For Counter = 1 To Me.Count
                Results(Counter) = Me(Counter).Domain
            ' Set return value
            Keys = Results
            ' Raise error for no items
            Err.Raise vbObjectError + 1005, , "Keys method failed: no clsPerson items exist in clsPeople collection"
        End If
    End Function
    Public Sub Remove(Index As Variant)
        ' Removes an item from the collection.  Index may be either ordinal position (Long) or Domain (String)
        coll.Remove Index
    End Sub
    Function NewEnum() As IUnknown
        ' Enables enumeration of the clsPeople parent collection, i.e.:
        ' For Each Child In Parent...Next
        Set NewEnum = coll.[_NewEnum]
    End Function

    Open in new window

    2) Add this to another class module, and name it clsPerson

    ' Created by Patrick Matthews, Verint Systems
    ' Created 2011-10-31
    Option Explicit
    Option Compare Text
    ' Container for "write-once read-many" property
    Private Safe_Domain As String
    Public Name As String
    Public Email1 As String
    Public Email2 As String
    Public Phone1 As String
    Public Phone2 As String
    Property Get Domain() As String
        ' Returns item's Domain value
        Domain = Safe_Domain
    End Property
    Property Let Domain(DomainString As String)
        ' Sets Domain value for item
        ' This makes the Domain property "write once, read many".  If the Domain is a zero length string,
        ' the Property Let allows you to change it; if not, the procedure raises a user defined
        ' error.  Basically, we cannot allow changes because we want this property to match the
        ' item's true key used when it was added to the parent clsPeople collection
        If Safe_Domain = "" Then
            Safe_Domain = DomainString
            Err.Raise vbObjectError + 1001, , "Cannot change Domain property of clsPerson object"
        End If
    End Property

    Open in new window

    3) Put this in a regular module:

    Option Explicit
    Sub RunIt()
        Dim Peeps As clsPeople
        Set Peeps = New clsPeople
        With Peeps
        End With
        Set Peeps = Nothing
    End Sub

    Open in new window

    Seems like a lot of code, but nearly all of it was generated automatically by my Parent Class Builder add-in, available at

    Sample file included
    LVL 41

    Expert Comment

    Patrick,  that rocks!
    LVL 2

    Expert Comment

    It looks like you have some solutions from others. For what it's worth here is a VBScript to parse this into a .CSV file. Change the first two variables to the path to your input file and the desired output file path respectively.

    strInputFilePath = "c:\temp\data.txt"
    strOutputFilePath = "c:\temp\output.csv"
    Const ForReading = 1
    Const ForWriting = 2
    intEmail = 1
    intPhone = 1
    Set objFso = CreateObject("Scripting.FileSystemObject")
    Set objInputFile = objFso.OpenTextFile(strInputFilePath, ForReading, False)
    Set objOutputFile = objFso.OpenTextFile(strOutputFilePath, ForWriting, True)
    ''Initialize the dictionary object
    Set objDataDict = CreateObject("Scripting.Dictionary")
    objDataDict.Item("Name") = ""
    objDataDict.Item("Email1") = ""
    objDataDict.Item("Email2") = ""
    objDataDict.Item("Phone1") = ""
    objDataDict.Item("Phone2") = ""
    objOutputFile.WriteLine Join(objDataDict.Keys(), ",")
    Do While Not objInputFile.AtEndOfStream
    	blnProc = False
    	strText = Trim(objInputFile.Readline)
    	Select Case True
    		Case Trim(strText) = ""
    			ReDim aryLineValues(4)
    			intEmail = 1
    			intPhone = 1
    			objOutputFile.WriteLine Join(objDataDict.Items(), ",")
    			For Each strKey In objDataDict.Keys()
    				objDataDict.Item(strKey) =  ""
    			blnProc = True
    		Case InStr(1, strText, "Domain\", 1) > 0
    			strValue = Replace(Right(strText, Len(strText) - InStr(strText, "(")), ")", "")
    			objDataDict.Item("Name") = strValue
    		Case InStr(1, strText, "email", 1) > 0
    			objDataDict.Item("Email" & CStr(intEmail)) = Trim(Replace(UCase(strText), "EMAIL", ""))
    			intEmail = intEmail + 1
    		Case InStr(1, strText, "phone", 1) > 0
    			objDataDict.Item("Phone" & CStr(intPhone)) = Trim(Replace(UCase(strText), "PHONE", ""))
    			intPhone = intPhone + 1
    	End Select
    If Not blnProc Then
    	objOutputFile.WriteLine Join(objDataDict.Items(), ",")
    End If

    Open in new window

    LVL 92

    Expert Comment

    by:Patrick Matthews
    Thanks, Dave!

    That Parent Class Builder add-in makes it really, really easy to implement class module solutions :)


    Featured Post

    Looking for New Ways to Advertise?

    Engage with tech pros in our community with native advertising, as a Vendor Expert, and more.

    Join & Write a Comment

    How to quickly and accurately populate Word documents with Excel data, charts and images (including Automated Bookmark generation) David Miller (dlmille) Synopsis In this article you’ll learn how to use ExcelToWord! to copy data,charts, shapes …
    In a multiple monitor setup, if you don't want to use AutoCenter to position your popup forms, you have a problem: where will they appear?  Sometimes you may have an additional problem: where the devil did they go?  If you last had a popup form open…
    The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …
    This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.

    733 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    19 Experts available now in Live!

    Get 1:1 Help Now