Solved

Use PowerShell to clean up last names in a CSV file

Posted on 2011-03-23
Last Modified: 2012-06-27
Looking for a way to clean up the "Last Name" column in a CSV file. Some last names are fine ("Johnson"), but others are not: "Johnson P.", "Hicks MD", "Renegar DDS PA", or "Ashcraft,". The file has several other columns that I want to keep untouched and output alongside the cleaned last name values. The output should look something like this:

BEFORE:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks DDS PC
105216   EA1357313  Charles     Chilcoat PC
105220   3A1C571EA  Gary        Renegar Dds Pa
105221   99E59A31C  Michael     Ashcraft,

AFTER:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks
105216   EA1357313  Charles     Chilcoat
105220   3A1C571EA  Gary        Renegar
105221   99E59A31C  Michael     Ashcraft
Question by:bndit
 
LVL 16

Accepted Solution

by:
Dale Harris earned 250 total points
ID: 35203596
If you're using a CSV, this should work:

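# Assumes the headers are exactly: Number1, Number2, First Name, Last Name
$File = Import-Csv "Book1.csv"
$File2 = @()
foreach ($line in $File) {
    # Headers that contain a space must be quoted
    # Keep the first word of the last name and drop a trailing comma ("Ashcraft," -> "Ashcraft")
    $LastName = $line."Last Name".Split(" ")[0].TrimEnd(",")
    $FirstName = $line."First Name"
    $Number1 = $line.Number1
    $Number2 = $line.Number2
    $File2 += "$Number1 $Number2 $FirstName $LastName"
}

$File2

# $File2 now holds the cleaned rows as plain strings. You can write them to a text file; to export a real CSV, build objects instead of strings. Play around with it to see what you have.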

HTH,

Dale Harris
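
If the goal is a real CSV with every other column left untouched, one option is to overwrite only the "Last Name" property on the imported objects instead of building strings. The following is a minimal sketch under that assumption; the inline sample rows stand in for the real file, and in practice you would use Import-Csv and Export-Csv with your own paths.

```powershell
# Sample data standing in for the real file (rows taken from the question)
$rows = @'
Number1,Number2,First Name,Last Name
105212,15A1A75CA,Craig,Hicks DDS PC
105221,99E59A31C,Michael,"Ashcraft,"
'@ | ConvertFrom-Csv

foreach ($row in $rows) {
    # Overwrite only the "Last Name" property; every other column is left as-is
    $row.'Last Name' = $row.'Last Name'.Split(' ')[0].TrimEnd(',')
}

# ConvertTo-Csv shows the result on screen; Export-Csv would write it to a file instead
$cleaned = $rows | ConvertTo-Csv -NoTypeInformation
$cleaned
```

Because the objects keep all of their original properties, this approach does not need to know the names of the other columns. Note that taking the first word would also truncate multi-word surnames such as "Van Buren".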
 
LVL 10

Assisted Solution

by:wls3
wls3 earned 250 total points
ID: 35203838
This function takes the input file, the delimiter (in case you are not using commas), and the output file as parameters, and writes the result as ASCII. More could be added, but this is a good starting point.

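Function CleanCSVFile($filename, $delimiter, $output)
{
    $csv = Import-Csv -Path $filename -Delimiter $delimiter
    # Start with the header row so the output is still a valid CSV
    $b = "Number1,Number2,First Name,Last Name`n"
    foreach($line in $csv)
    {
        # Keep the first word of the last name and drop a trailing comma
        $last = $line."Last Name".Split(" ")[0].TrimEnd(",")
        $b += $line.Number1 + "," + $line.Number2 + "," + $line."First Name" + "," + $last + "`n"
    }
    # The output is always comma-delimited ASCII, whatever the input delimiter was
    Out-File -FilePath $output -Encoding ASCII -InputObject $b
}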


It can be used like this:
CleanCSVFile "C:\Users\w\Documents\test.csv" "," "C:\Users\w\Documents\testout.csv"

 
LVL 16

Expert Comment

by:Dale Harris
ID: 35203927
wls3,

I like how you did the $Line."First Name".  I didn't know that would work.  Thanks for the tip.

DH

 
LVL 2

Author Comment

by:bndit
ID: 35204309
I'm calling the function as such...but it's failing

CleanCSVFile ('C:\scripts\output\shortHRList.csv',',','C:\scripts\Output\output1.csv')

I'm probably calling it wrong
 
LVL 2

Author Comment

by:bndit
ID: 35204730
OK, false alarm. I wasn't familiar with the way parameters are passed to functions in PowerShell, and didn't know that wrapping comma-separated arguments in parentheses makes PowerShell evaluate them as a single expression (an array) that is passed as the first parameter. I changed my call to this:

CleanCSVFile "C:\scripts\output\shortHRList.csv" "," "C:\scripts\Output\output1.csv"

and it's working...thanks both for your feedback!
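
The difference between the two call styles can be seen with a small sketch. Show-Args is a hypothetical helper written only for this illustration; it reports how many items landed in the first parameter and whether the second parameter was left empty.

```powershell
# Hypothetical helper that just reports what it received in each parameter
function Show-Args($filename, $delimiter, $output) {
    "filename has $(@($filename).Count) item(s); delimiter is null: $($null -eq $delimiter)"
}

# Parentheses plus commas build ONE array, which binds to $filename alone
$wrong = Show-Args ('in.csv', ',', 'out.csv')

# Space-separated arguments bind to the three parameters in order
$right = Show-Args 'in.csv' ',' 'out.csv'

$wrong
$right
```

The first call reports three items in $filename and a null $delimiter, which is why the parenthesized call to CleanCSVFile failed. Naming the parameters explicitly (for example, -filename, -delimiter, -output) avoids relying on position altogether.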
 
LVL 13

Expert Comment

by:soostibi
ID: 35205387
Adjust the path and the delimiter in the Import-Csv part (the delimiter is currently set to the tab character).
Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name"; 
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}}

 
LVL 13

Expert Comment

by:soostibi
ID: 35205392

Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name";   
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}} | export-csv c:\newfile.csv -notypeinformation
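
To see what the -replace pattern keeps, here is a rough sketch that runs it over the names from the question plus two edge cases that are not in the original data (an apostrophe and a hyphen). The second pattern is a hypothetical variant, not part of the answer above, shown only to illustrate how the character class could be widened.

```powershell
# Names from the question plus two hypothetical edge cases
$names = 'Hayes', 'Hicks DDS PC', 'Renegar Dds Pa', 'Ashcraft,', "O'Brien", 'Smith-Jones MD'

# Pattern from the post: keep only the leading run of word characters
$strict = $names | ForEach-Object { $_ -replace '^(\w+).*', '$1' }

# Hypothetical variant: also allow apostrophes and hyphens inside the name
$loose = $names | ForEach-Object { $_ -replace "^([\w'-]+).*", '$1' }

$strict -join ' | '
$loose -join ' | '
```

With the original pattern, "O'Brien" is cut down to "O" and "Smith-Jones MD" to "Smith", because \w does not match apostrophes or hyphens. If your data contains such names, a wider character class like the one in the variant may be a better fit.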

 
LVL 2

Author Closing Comment

by:bndit
ID: 35210098
Good answers.
Question has a verified solution.