Solved

Compare CSV files using Powershell

Posted on 2013-11-13
Last Modified: 2013-11-13
I have two very large CSV files, both with the headings 'Name', 'FullName', 'Length'.
What I am looking for is a way to compare these CSV files in Powershell.

I need to know...

1) What files are unique to each set (Based purely on Name).
2) What files exist in both but whose lengths are different (Based on Name and Length).

Basically, I need a report on what has changed between the two CSV files. Ideally the results would be output to separate files, one for each type of difference.

This is to solve a problem where a new version of software has grown rapidly in size but the cause of the growth is unknown.
Question by:Blowfelt82

Expert Comment

by:footech
ID: 39644221
Let me know how the script below works for you.  The output files include only the file names.  If you need something else, let me know and I can help adjust the code.  It assumes that all file names are unique within each .CSV; otherwise it won't work.
# Load both listings (headings: Name, FullName, Length)
$file1 = Import-Csv file1.csv
$file2 = Import-Csv file2.csv
# Names that appear in only one of the two files
Compare-Object $file1 $file2 -Property Name | Select-Object -ExpandProperty Name | Out-File UniqueFiles.txt
# Names that appear in both files but with a different Length
($file1 + $file2) | Group-Object -Property Name |
 Where-Object { $_.Count -eq 2 } |
 ForEach-Object { Compare-Object $_.Group[0] $_.Group[1] -Property Name,Length -PassThru } |
 Select-Object -ExpandProperty Name -Unique |
 Out-File ChangedSize.txt
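
To see what the two pipelines in the script return, here is a small sketch that runs the same logic against in-memory rows instead of files. The sample rows (a.dll, b.dll and so on) are made up for illustration and are not from the question; results are kept in variables rather than written with Out-File.

```powershell
# Hypothetical sample rows standing in for file1.csv and file2.csv
$file1 = @'
Name,FullName,Length
a.dll,C:\app\a.dll,100
b.dll,C:\app\b.dll,200
c.dll,C:\app\c.dll,300
'@ | ConvertFrom-Csv

$file2 = @'
Name,FullName,Length
a.dll,C:\app\a.dll,100
b.dll,C:\app\b.dll,250
d.dll,C:\app\d.dll,400
'@ | ConvertFrom-Csv

# Names present in only one of the two sets
$unique = Compare-Object $file1 $file2 -Property Name |
    Select-Object -ExpandProperty Name

# Names present in both sets whose Length differs
$changed = (@($file1) + @($file2)) | Group-Object -Property Name |
    Where-Object { $_.Count -eq 2 } |
    ForEach-Object { Compare-Object $_.Group[0] $_.Group[1] -Property Name, Length -PassThru } |
    Select-Object -ExpandProperty Name -Unique
```

With these rows, $unique holds c.dll and d.dll (each exists in only one set) and $changed holds b.dll (same name, different Length). Note that Import-Csv and ConvertFrom-Csv return Length as a string, which is enough to detect that two values differ.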


Author Comment

by:Blowfelt82
ID: 39644296
It's basically a comparison of a C:\ drive exported from a WIM file, so there may well be duplicated names. Perhaps using the full path (FullName) field would give greater accuracy?

Accepted Solution

by:
footech earned 2000 total points
ID: 39645338
Using the fullname would avoid errors.  All you would have to do to modify the script is change each instance of "Name" to "FullName".
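
A rough sketch of that substitution follows, wrapped in a function so the three kinds of difference come back separately. The function name Compare-FileListing, its parameter names, and the output file names in the usage comments are hypothetical and not part of the original script; it assumes PowerShell 3.0 or later (for [pscustomobject]) and that each FullName appears at most once per CSV.

```powershell
# Hypothetical helper: name and output shape are illustrative only.
function Compare-FileListing {
    param(
        [Parameter(Mandatory = $true)] [object[]] $Reference,   # rows from the first CSV
        [Parameter(Mandatory = $true)] [object[]] $Difference   # rows from the second CSV
    )

    # Paths found in only one listing; SideIndicator says which one.
    $diff = @(Compare-Object $Reference $Difference -Property FullName)

    # Paths found in both listings whose Length differs.
    $changed = @(($Reference + $Difference) | Group-Object -Property FullName |
        Where-Object { $_.Count -eq 2 } |
        ForEach-Object { Compare-Object $_.Group[0] $_.Group[1] -Property FullName, Length -PassThru } |
        Select-Object -ExpandProperty FullName -Unique)

    [pscustomobject]@{
        OnlyInFirst  = @($diff | Where-Object { $_.SideIndicator -eq '<=' } | Select-Object -ExpandProperty FullName)
        OnlyInSecond = @($diff | Where-Object { $_.SideIndicator -eq '=>' } | Select-Object -ExpandProperty FullName)
        ChangedSize  = $changed
    }
}

# Usage sketch (file names are assumptions):
# $result = Compare-FileListing (Import-Csv file1.csv) (Import-Csv file2.csv)
# $result.OnlyInFirst  | Out-File OnlyInFile1.txt
# $result.OnlyInSecond | Out-File OnlyInFile2.txt
# $result.ChangedSize  | Out-File ChangedSize.txt
```

Splitting on SideIndicator gives one output file per type of difference, which is what the question asked for. This only works if both listings use the same root path, since FullName is compared as a whole string.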

Author Closing Comment

by:Blowfelt82
ID: 39645462
Thanks again for your help