Solved

Use PowerShell to clean up last names in a CSV file

Posted on 2011-03-23
Last Modified: 2012-06-27
I'm looking for a way to clean up the "Last Name" column in a CSV file. Some last names are fine ("Johnson"), but others are not: "Johnson P.", "Hicks MD", "Renegar DDS PA", or "Ashcraft,". The file has several other columns that I want to keep untouched and output alongside the cleaned last name values. The output should look something like this:

BEFORE:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks DDS PC
105216   EA1357313  Charles     Chilcoat PC
105220   3A1C571EA  Gary        Renegar Dds Pa
105221   99E59A31C  Michael     Ashcraft,

AFTER:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks
105216   EA1357313  Charles     Chilcoat
105220   3A1C571EA  Gary        Renegar
105221   99E59A31C  Michael     Ashcraft
Question by:bndit
8 Comments
 

Accepted Solution

by:
Dale Harris earned 1000 total points
ID: 35203596
If you're using a CSV, this should work:

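# Assumes the CSV headers contain no spaces: number1, number2, firstname, lastname
$File = Import-Csv "Book1.csv"
$File2 = @()
foreach ($line in $File) {
    # Keep the first word of the last name and drop a trailing comma or period ("Ashcraft," -> "Ashcraft")
    $LastName = $line.lastname.Trim().Split(" ")[0] -replace '[,.]+$', ''
    $FirstName = $line.firstname
    $Number1 = $line.number1
    $Number2 = $line.number2
    $File2 += "$Number1 $Number2 $FirstName $LastName"
}

$File2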

#This will give you what you want in File2.  You can export it to text or CSV.  But play around with it to see what you have.
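
If the goal is to keep every other column untouched and end up with a real CSV, one option is to change only the last name property on each imported row and hand the rows to Export-Csv. The following is a minimal sketch, not a tested drop-in: the helper name Get-CleanLastName, the inline sample rows, and the output file name Book1-clean.csv are assumptions for illustration, and it uses the "Last Name" header from the question (quoted because it contains a space).

```powershell
# Hypothetical helper: keep the first word of a last name and strip trailing punctuation.
function Get-CleanLastName([string]$Name) {
    ($Name.Trim() -split '\s+')[0] -replace '[,.]+$', ''
}

# Inline sample standing in for: Import-Csv "Book1.csv"
$rows = @'
Number1,Number2,First Name,Last Name
105212,15A1A75CA,Craig,Hicks DDS PC
105221,99E59A31C,Michael,"Ashcraft,"
'@ | ConvertFrom-Csv

foreach ($row in $rows) {
    # Only this one property changes; every other column is carried through as-is
    $row.'Last Name' = Get-CleanLastName $row.'Last Name'
}

# Export-Csv writes the header row and quotes values where needed
$rows | Export-Csv -Path "Book1-clean.csv" -NoTypeInformation
```

Because the rows stay objects, any extra columns in the file come along without being listed by name.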

HTH,

Dale Harris

Assisted Solution

by:wls3
wls3 earned 1000 total points
ID: 35203838
This function takes the input file, the delimiter (in case you are not using commas), and the output file as parameters, and writes the result with ASCII encoding. There is more that could be added, but this is a good starting point.

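Function CleanCSVFile($filename, $delimiter, $output)
{
    $csv = Import-Csv -Path $filename -Delimiter $delimiter
    # Start with the header row so the output is still a usable CSV
    $b = "Number1,Number2,First Name,Last Name`n"
    foreach($line in $csv)
    {
        # Keep the first word of the last name and drop a trailing comma or period
        $last = $line."Last Name".Trim().Split(" ")[0] -replace '[,.]+$', ''
        $b += $line.Number1 + "," + $line.Number2 + "," + $line."First Name" + "," + $last + "`n"
    }
    Out-File -FilePath $output -Encoding ASCII -InputObject $b
}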


It can be used like this:
CleanCSVFile "C:\Users\w\Documents\test.csv" "," "C:\Users\w\Documents\testout.csv"


Expert Comment

by:Dale Harris
ID: 35203927
wls3,

I like how you did the $Line."First Name".  I didn't know that would work.  Thanks for the tip.

DH

Author Comment

by:bndit
ID: 35204309
I'm calling the function as such...but it's failing

CleanCSVFile ('C:\scripts\output\shortHRList.csv',',','C:\scripts\Output\output1.csv')

I'm probably calling it wrong
error.png

Author Comment

by:bndit
ID: 35204730
OK, false alarm. I wasn't familiar with the way parameters are passed to functions in PowerShell, and didn't know that parentheses with commas are evaluated as a single expression (an array), which is then passed as the first argument. I changed my call to this:

CleanCSVFile "C:\scripts\output\shortHRList.csv" "," "C:\scripts\Output\output1.csv"

and it's working...thanks both for your feedback!
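
For anyone hitting the same thing, the difference between the two call styles can be seen with a small throwaway function. This is only an illustrative sketch; Show-Args is a hypothetical name and is not part of the scripts posted above.

```powershell
# Hypothetical function used only to show how arguments bind
function Show-Args($first, $second, $third) {
    "first=[$first] second=[$second] third=[$third]"
}

# Space-separated: three separate arguments, one per parameter
Show-Args "in.csv" "," "out.csv"

# Parentheses plus commas: a single array bound to $first; $second and $third stay empty
Show-Args ("in.csv", ",", "out.csv")
```

With CleanCSVFile, the second style means the delimiter and output parameters receive nothing, which is why the call fails.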

Expert Comment

by:soostibi
ID: 35205387
Adjust the path and the delimiter in the Import-Csv part (the delimiter is currently set to the TAB character).
Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name"; 
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}}


Expert Comment

by:soostibi
ID: 35205392

Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name";   
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}} | export-csv c:\newfile.csv -notypeinformation
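
To see what the -replace pattern above does, it can be run against the sample names from the question. This is a small sketch; the hyphenated test name and the second pattern are assumptions added for illustration and do not come from the thread.

```powershell
$samples = "Hayes", "Hicks DDS PC", "Renegar Dds Pa", "Ashcraft,"

$cleaned = foreach ($name in $samples) {
    # ^(\w+) captures the leading run of word characters; .* consumes the rest of the value
    $name -replace "^(\w+).*", '$1'
}
$cleaned   # Hayes, Hicks, Renegar, Ashcraft

# \w does not match a hyphen or an apostrophe, so such names are cut short
$short = "Smith-Jones MD" -replace "^(\w+).*", '$1'       # Smith

# Assumed variant that also allows - and ' inside the name
$full = "Smith-Jones MD" -replace "^([\w'-]+).*", '$1'    # Smith-Jones
```

Whether the wider character class is wanted depends on the data, so it is worth checking the real file for hyphenated or apostrophized names first.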


Author Closing Comment

by:bndit
ID: 35210098
Good answers.

Question has a verified solution.
