Solved

Use PowerShell to clean up last names in a CSV file

Posted on 2011-03-23
Last Modified: 2012-06-27
I'm looking for a way to clean up the "Last Name" column in a CSV file. Some last names are fine, such as "Johnson", but others carry extra text: "Johnson P.", "Hicks MD", "Renegar DDS PA" or "Ashcraft,". The file has several other columns that I want to keep untouched and output alongside the cleaned last name values. The output should look something like this:

BEFORE:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks DDS PC
105216   EA1357313  Charles     Chilcoat PC
105220   3A1C571EA  Gary        Renegar Dds Pa
105221   99E59A31C  Michael     Ashcraft,

AFTER:
Number1  Number2    First Name  Last Name
100552   B7BA7153A  Darwin      Hayes
105212   15A1A75CA  Craig       Hicks
105216   EA1357313  Charles     Chilcoat
105220   3A1C571EA  Gary        Renegar
105221   99E59A31C  Michael     Ashcraft
Question by:bndit
8 Comments
 
LVL 16

Accepted Solution

by:
Dale Harris earned 250 total points
ID: 35203596
If you're using a CSV, this should work:

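# Assumes the CSV headers are number1, number2, firstname and lastname (no spaces).
$File = Import-Csv "Book1.csv"
$File2 = @()
foreach ($line in $File) {
    # Keep only the first word of the last name and drop a trailing comma ("Ashcraft," -> "Ashcraft").
    $LastName = $line.lastname.Trim().Split(" ")[0].TrimEnd(",")
    $FirstName = $line.firstname
    $Number1 = $line.number1
    $Number2 = $line.number2
    # Collect one space-separated text line per row.
    $File2 += "$Number1 $Number2 $FirstName $LastName"
}

$File2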

#This will give you what you want in File2.  You can export it to text or CSV.  But play around with it to see what you have.

HTH,

Dale Harris
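
The export step mentioned above could look roughly like the following. This is only a sketch: it assumes, as the snippet above does, headers named number1, number2, firstname and lastname, it uses two made-up inline rows in place of Book1.csv, and it writes to a hypothetical file in the temp folder. It builds one object per row instead of one string per row so that Export-Csv can write real columns. The [pscustomobject] syntax needs PowerShell 3.0 or later.

```powershell
# Inline sample data standing in for Book1.csv (illustrative rows only).
$rows = @'
number1,number2,firstname,lastname
105212,15A1A75CA,Craig,Hicks DDS PC
105221,99E59A31C,Michael,"Ashcraft,"
'@ | ConvertFrom-Csv

# One object per row; only the last name is changed.
$cleaned = foreach ($row in $rows) {
    [pscustomobject]@{
        number1   = $row.number1
        number2   = $row.number2
        firstname = $row.firstname
        # First word only, with any trailing comma removed.
        lastname  = $row.lastname.Trim().Split(' ')[0].TrimEnd(',')
    }
}

# Hypothetical output path; -NoTypeInformation leaves out the "#TYPE" line.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'Book1-cleaned.csv'
$cleaned | Export-Csv -Path $outPath -NoTypeInformation
```

Reading the file back with Import-Csv should then give the same four columns, with lastname reduced to "Hicks" and "Ashcraft".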
 
LVL 10

Assisted Solution

by:wls3
wls3 earned 250 total points
ID: 35203838
This will automatically handle the encoding, input file, output file and delimiter (in case you are not using commas). There is more that could be added, but this is a good starting point.

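Function CleanCSVFile($filename, $delimeter, $output)
{
    $csv = Import-Csv -Path $filename -Delimiter $delimeter
    # Start from an empty string so the function never picks up a $b from the caller's scope.
    $b = ""
    foreach($line in $csv)
    {
        # Keep the first word of the last name and drop a trailing comma ("Ashcraft," -> "Ashcraft").
        $b += $line.Number1 + "," + $line.Number2 + "," + $line."First Name" + "," + $line."Last Name".Split(" ")[0].TrimEnd(",") + "`n"
    }
    # Write the comma-separated rows (no header row) as ASCII text.
    Out-File -FilePath $output -Encoding ASCII -InputObject $b
}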

It can be used like this:
CleanCSVFile "C:\Users\w\Documents\test.csv" "," "C:\Users\w\Documents\testout.csv"

 
LVL 16

Expert Comment

by:Dale Harris
ID: 35203927
wls3,

I like how you did the $Line."First Name".  I didn't know that would work.  Thanks for the tip.

DH
 
LVL 2

Author Comment

by:bndit
ID: 35204309
I'm calling the function like this, but it's failing:

CleanCSVFile ('C:\scripts\output\shortHRList.csv',',','C:\scripts\Output\output1.csv')

I'm probably calling it wrong
 
LVL 2

Author Comment

by:bndit
ID: 35204730
OK, false alarm. I wasn't familiar with the way parameters are passed to functions in PowerShell, and didn't know that parentheses with commas evaluate the values as a single expression (one array, passed as the first argument). I changed my call to this:

CleanCSVFile "C:\scripts\output\shortHRList.csv" "," "C:\scripts\Output\output1.csv"

and it's working. Thanks to you both for your feedback!
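
The difference between the two call styles can be seen with a small throwaway function. This is a sketch only: Show-Args and its parameter names are hypothetical and are not part of the solution above. It returns a string showing which value landed in which parameter.

```powershell
# Hypothetical helper: reports what each of its three parameters received.
function Show-Args($first, $second, $third) {
    "first=[$first] second=[$second] third=[$third]"
}

# Space-separated arguments: one value is bound to each parameter.
$spaced = Show-Args 'in.csv' ',' 'out.csv'

# Parentheses with commas: the three values form ONE array, which is bound
# to $first alone; $second and $third receive nothing.
$parenthesized = Show-Args ('in.csv', ',', 'out.csv')

$spaced
$parenthesized
```

The first call yields first=[in.csv] second=[,] third=[out.csv]. The second yields first=[in.csv , out.csv] second=[] third=[], because the whole array is expanded into the first parameter. In CleanCSVFile that would leave $delimeter and $output empty, which is the likely reason the parenthesized call failed.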
 
LVL 13

Expert Comment

by:soostibi
ID: 35205387
In the Import-Csv part you have to adjust the path and the delimiter (it is currently set to the TAB character).
Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name"; 
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}}
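
To see what the -replace pattern does on its own, it can be tried on a few sample strings. This is a sketch: the first four values come from the question, while "O'Brien" and "Smith-Jones MD" are made-up names added here to show a limitation, and the wider pattern at the end is an assumption about what one might want, not part of the answer above.

```powershell
# Last names from the question.
$samples = 'Hayes', 'Hicks DDS PC', 'Renegar Dds Pa', 'Ashcraft,'

# ^(\w+) captures the leading run of word characters; .* swallows the rest;
# '$1' (single-quoted, so PowerShell does not expand it) puts back the capture.
$cleaned = $samples | ForEach-Object { $_ -replace '^(\w+).*', '$1' }
$cleaned

# Limitation: \w does not match apostrophes or hyphens, so such names are cut short.
$strict = "O'Brien" -replace '^(\w+).*', '$1'

# Possible variant (an assumption): also allow ' and - inside the name.
$loose  = "O'Brien" -replace "^([\w'-]+).*", '$1'
$hyphen = 'Smith-Jones MD' -replace "^([\w'-]+).*", '$1'
```

The four names from the question come out as Hayes, Hicks, Renegar and Ashcraft. With the original pattern "O'Brien" is reduced to "O"; with the wider character class it stays "O'Brien", and "Smith-Jones MD" becomes "Smith-Jones".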

 
LVL 13

Expert Comment

by:soostibi
ID: 35205392

Import-Csv C:\yorfile.csv -Delimiter "`t" | Select-Object number1, number2, "first name", @{n="Last Name";   
    e = {$_."Last Name" -replace "^(\w+).*",'$1'}} | export-csv c:\newfile.csv -notypeinformation
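
If the real file has more columns than the four shown, listing each one in Select-Object gets tedious. A possible alternative, sketched here with two inline rows from the question in place of the real file, is to overwrite only the "Last Name" property on each imported row and let every other column pass through unchanged. ConvertFrom-Csv and ConvertTo-Csv are used so the example runs without touching the disk; with real files, Import-Csv and Export-Csv would take their place.

```powershell
# Inline sample standing in for the real file (comma-delimited here for brevity).
$rows = @'
Number1,Number2,First Name,Last Name
100552,B7BA7153A,Darwin,Hayes
105220,3A1C571EA,Gary,Renegar Dds Pa
'@ | ConvertFrom-Csv

foreach ($row in $rows) {
    # Only this one property is rewritten; all other columns are left as imported.
    $row.'Last Name' = $row.'Last Name' -replace '^(\w+).*', '$1'
}

# Convert back to CSV text; with files, pipe $rows to Export-Csv instead.
$csvText = $rows | ConvertTo-Csv -NoTypeInformation
$csvText
```

Because the row objects keep whatever properties the header defines, the same loop should work no matter how many extra columns the file contains.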

 
LVL 2

Author Closing Comment

by:bndit
ID: 35210098
Good answers.