Cleansing CSV File

I need help cleansing and sorting the the attached file (extract only).  The following actions need to occur for my data to be cleansed:

1. Column A - remove all data after the first email address i.e. remove all data including the space after the email address leaving you with just a simple email address and not mention of SMTP etc

2. Column B,C,D - Extract Only First Name (where one exists) OR If there is not a first name and only Title + Surname THEN return Title + Surname in a single cell.

4. Column E - Select only those people in the list whose criteria in this column = 'DSA'

I am looking to create a comma separated file with the following syntax of all data in my list - with only those people whose 'category' = 'DSA':


email[separator]name[new line]
email[separator]name[new line] etc etc

The objective is that my newsletter application imports all data and the is able to personalise all letters/newsletters

Example

Dear David, (see syntax above) - assuming first name avalable
Dear Mr Scott, (see syntax above) - assuming no first name available

Contacts-CSV-File-Sample.CSV
davidascottAsked:
Who is Participating?
 
answer_dudeCommented:
RE:   removing "Dear" -- you would not delete the entire line, just the portion of the line that inserts "Dear" -- the new snippet below does that.

RE:  ERS category... do you want both the ERS category and the DSA category or just ERS?  (snippet attached does both).  HOWEVER -- you have categories such as "Framers" and "Suppliers" which also contain the letters "ERS" -- I've added another block to remove these, specifically... but if you ever have other categories that contain "ERS" this code would fail.

RE:  Final result... did you want the new email address and the salutation name to be in the same field with a comma separating the two?  That's what the new snippet does.  However, there's no reason why you can't have both in their own field, when you save as .csv file Excel will put commas between them -- nonetheless, the code below should work.

Finally, just a warning... the last sample file you posted has a different format -- the code I've given you assumes the same format each time -- it can be made more robust (by tying to column heading names) but even then there needs to be consistency across files..
Sub ReformatCSV()
On Error Resume Next
    Dim FindLastRow As Long
    Dim FindLastColumn As Long
    Dim x As Long
    Dim rRange As Range
        
    ' Delete all rows that do not have Categories containing DSA or ERS
    With ActiveSheet
         If .AutoFilterMode = False Then .Cells(1, 1).AutoFilter
                .Range("A1").AutoFilter Field:=6, Criteria1:="<>*DSA*", Operator:=xlAnd, _
                                                  Criteria2:="<>*ERS*"
                .Range("A1").CurrentRegion.Offset(1, 0).SpecialCells _
                        (xlCellTypeVisible).EntireRow.Delete

        .AutoFilterMode = False

    End With
    
    'Delete known categories with "ERS" in the name
    With ActiveSheet
         If .AutoFilterMode = False Then .Cells(1, 1).AutoFilter
                .Range("A1").AutoFilter Field:=6, Criteria1:="=Suppliers", Operator:=xlOr, _
                                                  Criteria2:="=Framers"
                .Range("A1").CurrentRegion.Offset(1, 0).SpecialCells _
                        (xlCellTypeVisible).EntireRow.Delete

        .AutoFilterMode = False
    End With
    
    FindLastRow = 0
    If WorksheetFunction.CountA(Cells) > 0 Then
        'Search for any entry, by searching backwards by Rows.
        FindLastRow = Cells.Find(What:="*", After:=[A1], _
                      SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious).Row
    End If
    FindLastColumn = 0
    If WorksheetFunction.CountA(Cells) > 0 Then
        'Search for any entry, by searching backwards by Columns.
        FindLastColumn = Cells.Find(What:="*", After:=[IV1], _
                         SearchOrder:=xlByColumns, _
                         SearchDirection:=xlPrevious).Column
    End If

    Set rRange = ActiveSheet.Range(Cells(1, 1), Cells(FindLastRow, FindLastColumn))
    x = 2
    With rRange
        Do While x <= FindLastRow
        
            .Cells(x, 7).Value = Left(.Cells(x, 1), InStr(1, .Cells(x, 1), " ") - 1)
            .Cells(x, 7) = .Cells(x, 7) & "," & IIf(IsEmpty(.Cells(x, 4)), .Cells(x, 3) & " " & .Cells(x, 5), .Cells(x, 4))
            x = x + 1
            
        Loop
    End With

End Sub

Open in new window

0
 
answer_dudeCommented:
If you put the code below in another workbook, keep it open, then toggle to your CSV file then run the code from there this should work

Caution:  this code deletes all rows that do not have "DSA" in column E

Sub ReformatCSV()

    Dim FindLastRow As Long
    Dim FindLastColumn As Long
    Dim x As Long
    Dim rRange As Range
        
    ' Delete all rows that do not have Categories = DSA
    With ActiveSheet
         If .AutoFilterMode = False Then .Cells(1, 1).AutoFilter
                .Range("A1").AutoFilter Field:=5, Criteria1:="<>DSA"
                .Range("A1").CurrentRegion.Offset(1, 0).SpecialCells _
                        (xlCellTypeVisible).EntireRow.Delete

        .AutoFilterMode = False

    End With
        
    FindLastRow = 0
    If WorksheetFunction.CountA(Cells) > 0 Then
        'Search for any entry, by searching backwards by Rows.
        FindLastRow = Cells.Find(What:="*", After:=[A1], _
                      SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious).Row
    End If
    FindLastColumn = 0
    If WorksheetFunction.CountA(Cells) > 0 Then
        'Search for any entry, by searching backwards by Columns.
        FindLastColumn = Cells.Find(What:="*", After:=[IV1], _
                         SearchOrder:=xlByColumns, _
                         SearchDirection:=xlPrevious).Column
    End If

    Set rRange = ActiveSheet.Range(Cells(1, 1), Cells(FindLastRow, FindLastColumn))
    x = 2
    With rRange
        Do While x <= FindLastRow
        
            .Cells(x, 1).Value = Left(.Cells(x, 1), InStr(1, .Cells(x, 1), " ") - 1)
            .Cells(x, 6) = IIf(IsEmpty(.Cells(x, 3)), "Dear " & .Cells(x, 2) & " " & .Cells(x, 4) & ", ", "Dear " & .Cells(x, 3) & ", ")
            x = x + 1
            
        Loop
    End With

End Sub

Open in new window

0
 
answer_dudeCommented:
I should add that the above snippet puts your salutation in column F.
0
Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

 
davidascottAuthor Commented:
Sorry for the delay in responding

Thanks, the code works a treat.  Because the newsletter in the CMS inserts the 'dear' itself.  Do I simply remove line 40 code to give me a result that just ends with (a) first name only or (b) if no first name given then Mr/Mrs Snell as a result.

Thanks in advance
0
 
davidascottAuthor Commented:
Note 1:
Just realised that when searching through the entire list, many of the records have multiple categories and are 'semi-colon' delimited.  I have attached a revised sample. Please would you take this into account as well as the last update above.

Note 2
Please would you change the 'category' that the search looks to 'ERS' as the ERS 'category' would also be required in terms of producing the final output

Note 3
Can the code also produce just the final result in the syntax required by the CMS

email[separator]name[new line]
email[separator]name[new line]

Thanks, David
0
 
davidascottAuthor Commented:
Sorry....see revised sample file attached.

Best, David
Contacts-CSV-File-Sample-v2.CSV
0
 
davidascottAuthor Commented:
The separator as referred to in the comment before last should remain comma delimited
0
 
davidascottAuthor Commented:
Please ignore last updated sample file. Please refer to the one attached to this post i.e. v3
Contacts-CSV-File-Sample-v3.CSV
0
 
davidascottAuthor Commented:
Your output was great - thank you
0
 
davidascottAuthor Commented:
Great solution and a real time saver. Thank You.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.