Use PowerShell to merge info from two different CSV files into a new CSV file

Posted on 2010-08-27
I have a CSV file that contains this information (approx 10k entries)
EMPLOYEE_NUMBER, FIRST_NAME,MIDDLE_NAME,LAST_NAME,TITLE (some other columns..)
12345,John,K.,Doe,Manager
23425,Jane,,Doe,Director
23434,James,L.,Doe,Clerk
44444,David,,Smith,Sales Rep
,Robert,Jones,Sales Rep

I have a second CSV file that contains this information (approx 50 entries)
Name,NTAccountName,employeeNumber
John K. Doe,domain\John.Doe,12345
Jane Doe, domain\Jane.Doe,23425,

Goal: to merge the employee's title information from the first CSV file with the second file. So the output should be this:

Name,NTAccountName,employeeNumber,Title,
John K. Doe,domain\John.Doe,12345,Manager
Jane Doe, domain\Jane.Doe,23425,Director

I found a script on the Internet, but I'm having a hard time tweaking it to do what I want. Here's the post that I found. Since both CSV files share a common field, I'd think I could use the employeeNumber to do the match.

# Author: Shay Levy

$csv1 = import-csv D:\Scripts\temp\logonInfo1.csv
$csv2 = import-csv D:\Scripts\temp\logonInfo2.csv

$result1 = $(compare-object $csv1 $csv2 -IncludeEqual | where {$_.SideIndicator -eq "=="} | foreach {$_.InputObject})
$result2 = $(compare-object $csv2 $csv1 -IncludeEqual | where {$_.SideIndicator -eq "=="} | foreach {$_.InputObject})

$members1 = $csv1 | gm -mem NoteProperty | foreach {$_.name}
$members2 = $csv2 | gm -mem NoteProperty | foreach {$_.name}

$newMembers = $(compare-object $members1 $members2 | where {$_.SideIndicator -eq "=>"} | foreach {$_.InputObject})

for($i=0;$i -lt $newMembers.length;$i++){
    $prop = $newMembers[$i]
    $result1 | add-member noteproperty $prop $null

    for($x=0;$x -lt $result1.length;$x++){
        $newMembers | foreach {$result1[$x].$prop = $result2[$x].$prop }
    }
}

$result1 | export-csv D:\Scripts\temp\new.csv -NoTypeInformation
get-content D:\Scripts\temp\new.csv

Question by:bndit
Author Comment

Here's another possible solution (again from Shay), but it's not working for me

Accepted Solution

This should work.

Chris

$CSV1 = Import-Csv "First.csv"
$CSV2 = Import-Csv "Second.csv"

# Since the second is smaller we will loop through the first once and use Where-Object
# to find entries in the second

$CSV1 | ForEach-Object {
  # Store the EmployeeNumber from the first CSV
  $EmployeeNumber = $_.EMPLOYEE_NUMBER
  # And pull out the title field
  $Title = $_.TITLE

  # Find the corresponding entry in CSV 2 and prepare some output
  # Making a new field for Title and filling it with the title from CSV1.
  $CSV2 |
    Where-Object { $_.employeeNumber -eq $EmployeeNumber } |
    Select-Object Name, NTAccountName, EmployeeNumber,
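
The snippet above is cut off after the property list. Going by its own comment ("Making a new field for Title and filling it with the title from CSV1"), a minimal sketch of how the loop could be finished might look like this. The inline sample rows, the use of ConvertFrom-Csv in place of the real files, and the $merged variable are assumptions made for illustration, not part of the original post.

```powershell
# Sample rows standing in for First.csv and Second.csv (assumed, taken from the question).
$CSV1 = @'
EMPLOYEE_NUMBER,FIRST_NAME,MIDDLE_NAME,LAST_NAME,TITLE
12345,John,K.,Doe,Manager
23425,Jane,,Doe,Director
23434,James,L.,Doe,Clerk
'@ | ConvertFrom-Csv

$CSV2 = @'
Name,NTAccountName,employeeNumber
John K. Doe,domain\John.Doe,12345
Jane Doe,domain\Jane.Doe,23425
'@ | ConvertFrom-Csv

# $merged is a hypothetical name for the collected output.
$merged = $CSV1 | ForEach-Object {
    $EmployeeNumber = $_.EMPLOYEE_NUMBER
    $Title = $_.TITLE

    # For each row of the first file, scan the second file for the same number
    # and emit the matching rows with an extra calculated Title property.
    $CSV2 |
        Where-Object { $_.employeeNumber -eq $EmployeeNumber } |
        Select-Object Name, NTAccountName, employeeNumber,
            @{ Name = 'Title'; Expression = { $Title } }
}

# Show the result as CSV text; Export-Csv with -NoTypeInformation would write it to a file instead.
$merged | ConvertTo-Csv -NoTypeInformation
```

With about 10k rows in the first file and about 50 in the second, this scans the small file once per row of the large one, which is simple but does more comparisons than a keyed lookup would.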

Expert Comment
Import-Csv c:\ee\first.csv | %{
    $title = $_.title
    $secondinfo.($_.employee_number) | Select-Object name, ntaccountname, employeenumber,

Expert Comment
Import-Csv c:\ee\first.csv | %{
    $name = $_.first_name + " " + @(if($_.middle_name){$_.middle_name+" "}) + $_.last_name
    $Title = $_.title
    $emplnum = $_.employee_number
    $secondinfo.($name) | Select-Object name, ntaccountname, @{n="employeenumber"; e={$emplnum}},
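
The idea in the snippet above is to match on a full name rebuilt from the first file's columns instead of on the employee number. A rough sketch of that idea follows; the Get-FullName helper, the inline sample rows, and the way $secondinfo is built here are all assumptions for illustration, since the original definition of $secondinfo is not shown in the post.

```powershell
# Hypothetical helper: joins the non-empty name parts with single spaces.
function Get-FullName {
    param($Row)
    $parts = @($Row.FIRST_NAME, $Row.MIDDLE_NAME, $Row.LAST_NAME) | Where-Object { $_ }
    $parts -join ' '
}

# Sample rows (assumed, taken from the question).
$first = @'
EMPLOYEE_NUMBER,FIRST_NAME,MIDDLE_NAME,LAST_NAME,TITLE
12345,John,K.,Doe,Manager
23425,Jane,,Doe,Director
44444,David,,Smith,Sales Rep
'@ | ConvertFrom-Csv

$second = @'
Name,NTAccountName,employeeNumber
John K. Doe,domain\John.Doe,12345
Jane Doe,domain\Jane.Doe,23425
'@ | ConvertFrom-Csv

# Assumed shape of $secondinfo: a plain hashtable keyed by the Name column.
$secondinfo = @{}
$second | ForEach-Object { $secondinfo[$_.Name] = $_ }

$byName = $first | ForEach-Object {
    $name = Get-FullName $_
    $title = $_.TITLE
    # Rows of the first file without a counterpart in the second file are skipped.
    if ($secondinfo.ContainsKey($name)) {
        $secondinfo[$name] | Select-Object Name, NTAccountName, employeeNumber,
            @{ n = 'Title'; e = { $title } }
    }
}

$byName | ConvertTo-Csv -NoTypeInformation
```

Matching on names only works when the Name column is formatted exactly like the rebuilt string and no two employees share a full name, so the employee number remains the safer key when it is available.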

Assisted Solution
$firstinfo = Import-Csv c:\ee\first.csv | Group-Object -AsHashTable -Property employee_number
Import-Csv c:\ee\second.csv |
    Select-Object name, ntaccountname, employeenumber,
        @{n="title"; e={$firstinfo.($_.employeenumber).item(0).title}} |
    Export-Csv c:\ee\new.csv -NoTypeInformation


Author Comment

@soostibi
That works! Thanks. Now, could you comment the script so that I understand the logic a bit better? I kinda get parts of it but I'd like to fully understand it. Thanks again.
Expert Comment

1: I import the first CSV and convert it to a hashtable in one step, keyed by the employee number. This is done with the Group-Object cmdlet; because the employee number is unique, each key of the resulting hashtable refers to a single row of the CSV. I use this because it is the fastest way I know to do the lookup.
2: I import the second CSV and pipe it into
3: a Select-Object cmdlet, which selects the relevant properties and creates a new one:
4: the 'title' property, which comes from looking up the item in the hashtable by the employee number of the current element of the second CSV as it comes through the pipeline. As I mentioned, the hashtable is not a simple one because of Group-Object: each value is a collection, so I have to get the actual row through the parameterized property named 'item'.
5: The result is exported to a new CSV.
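
The five steps above can be sketched with comments roughly as follows. This is an illustration and not the exact script from the thread: the inline sample rows, the $firstCsv, $secondCsv and $result names, the -AsString switch, and the guard for missing keys are assumptions added here.

```powershell
# Sample rows standing in for the two files (assumed, taken from the question).
$firstCsv = @'
EMPLOYEE_NUMBER,FIRST_NAME,MIDDLE_NAME,LAST_NAME,TITLE
12345,John,K.,Doe,Manager
23425,Jane,,Doe,Director
23434,James,L.,Doe,Clerk
'@ | ConvertFrom-Csv

$secondCsv = @'
Name,NTAccountName,employeeNumber
John K. Doe,domain\John.Doe,12345
Jane Doe,domain\Jane.Doe,23425
'@ | ConvertFrom-Csv

# Step 1: index the first file by employee number.
# -AsString (an addition in this sketch) keeps the keys as plain strings.
$firstinfo = $firstCsv | Group-Object -Property EMPLOYEE_NUMBER -AsHashTable -AsString

# Steps 2 to 4: walk the second file and add a calculated Title property.
$result = $secondCsv | Select-Object Name, NTAccountName, employeeNumber,
    @{ n = 'Title'; e = {
        # Each hashtable value is a collection of rows; take the first one if the key exists.
        $match = $firstinfo[$_.employeeNumber]
        if ($match) { $match[0].TITLE }
    } }

# Step 5: Export-Csv -NoTypeInformation would write the file; here the CSV text is only displayed.
$result | ConvertTo-Csv -NoTypeInformation
```

The large file is read once to build the index, and each of the roughly 50 rows in the small file then costs a single hashtable lookup.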

Author Comment

Awesome, thanks both of you for your help. I think the fair thing to do is for me to split the points as you both helped me solve this problem.
Excellent assistance.
