Powershell Script to sort and compare file contents

Julie Kurpa
Julie Kurpa used Ask the Experts™
on
On a daily basis, I want to compare two text files and place the differences between them into a 3rd text file.

The text files are comma delimited extracts from an Oracle DB which contain personnel data.  One file will be from the previous day.  The 2nd file will be the newest one with data changes.

I want my 3rd file to contain only the changed rows and new rows.

Some things to know about the data:
  1. The first column "EMPLOYEE_ID" will never change for existing records.  But new rows will be assigned a new EMPLOYEE_ID value.
  2. The headers for the data elements will not have double quotes.  However the data itself will include double quotes. I need the 1st line of headers included in the 3rd file creation.
  3. Yes I know that "EMPL_SUFFIX_CD" is in there twice.  I have no control over this. :(
  4. Each file, each day, will have headers at the top but the data itself will be sorted differently each day.  I have no control over this.

I have powerscript code put together, however the part that compares is not giving me what I expect.

My script does this:
1. Rename the newest file to append current date.  For the example data below, I've just named them "yesterdaydata" and "todaydata".
2. sort both text files by the 1st column and keep headers the top.  I'm assuming this will make it easier to compare the two files. I can't get this to work.
3a. check each line of file 1 (yesterday) with file 2 (today) and send any EMPLOYEE_ID data that has changed from file 2 (today) to the 3rd file.  <- this not working
3b. Also check if there are any new lines of data added (new EMPLOYEE_IDs) to File 2 (today) and put them in the 3rd file as well.  <-- not working
 


Example of contents:

yesterdayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Inactive"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVIN","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"


todayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


The differences between the files are:
CLAVIN changed to CLAVICAL for Employee_id 0000004403
New employee LEBECK
Employee "0000000002" changed to "Active"

What I want to show in the 3rd file is the following (not in any particular order but need headers at the top):

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


Here's the code I have so far:


$workingDir = "C:\PS_Scripts\"

# #################################################
# Obtain file name from yesterday employee file.
# #################################################
$YesterdayDate = Get-Date ((Get-Date).AddDays(-1)) -Format yyyyMMdd
$YesterdayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$YesterdayDate*")

"Yesterday file is: $YesterdayFile"


# #################################################
# Rename todays employee file to append todays date
# #################################################

$fileName = "C:\PS_Scripts\employees.txt"

# Check the file exists
if (-not(Test-Path $fileName)) {break}

# Display the original name
"Original filename: $fileName"

$fileObj = get-item $fileName

# Get the date
$DateStamp = get-date -uformat "%Y%m%d"

$extOnly = $fileObj.extension

if ($extOnly.length -eq 0) {
   $nameOnly = $fileObj.Name
   rename-item "$fileObj" "$nameOnly-$DateStamp"
   }
else {
   $nameOnly = $fileObj.Name.Replace( $fileObj.Extension,'')
   rename-item "$fileName" "$nameOnly-$DateStamp$extOnly"
   }

$TodayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$DateStamp*")

"TodayFile is:  $TodayFile"


# ##########################################################
# Sort the two files (or just the new one assuming we
#                     did the other one yesterday)

# ##########################################################

(Nothing I've tried so far for sorting has worked so I have no code to show).  
(For some attempts I get an error that a column exists twice in the data but I have no control over this)
   
# ##########################################################
# Compare yesterday employee list with todays employee list.
# Place differences into new employee text file
#
#  (This is not working either even after manually sorting the files content)
# ##########################################################

get-content $TodayFile |
where {$YesterdayFile -notcontains $_} |
set-content C:\PS_Scripts\diff.txt

# place first line (headers) into final file
get-content $TodayFile -First 1  > C:\PS_Scripts\employees_final.txt
get-content C:\PS_Scripts\diff.txt >> C:\PS_Scripts\employees_final.txt
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
MichelangeloSystem Administrator / Postmaster

Commented:
You should definitely provide some more details on file contents and the code you have tried.
Julie KurpaSr. Systems Programmer

Author

Commented:
Michelangelo, I thought I did that.  What else should I provide?
Most Valuable Expert 2018
Distinguished Expert 2018
Commented:
The only things missing were what should happen if an employee was removed (or if that can happen at all), and what should happen if no changes were found (create no file, create an empty file, create a file with the header line only, ...)
Changes below line 3 should not be required, except for line 30 (see the comment):
$yesterdayPath = '.\yesterday.csv'
$todayPath = '.\today.csv'
$finalPath = 'C:\PS_Scripts\employees_final.txt'

$originalHeader = Get-Content -Path $yesterdayPath -First 1
$workHeader = ($originalHeader -replace 'EMPL_SUFFIX_CD(?!.*EMPL_SUFFIX_CD)', 'EMPL_SUFFIX_CD_2').Split(',')
$today = [ordered]@{}
$yesterday = [ordered]@{}
Get-Content -Path $yesterdayPath | Select-Object -Skip 1 | ConvertFrom-Csv -Header $workHeader | Sort-Object -Property EMPLOYEE_ID | ForEach-Object {$yesterday[$_.EMPLOYEE_ID] = $_}
Get-Content -Path $todayPath | Select-Object -Skip 1 | ConvertFrom-Csv -Header $workHeader | Sort-Object -Property EMPLOYEE_ID | ForEach-Object {$today[$_.EMPLOYEE_ID] = $_}
$differences = $(
	$today.Values | ForEach-Object {
		Write-Host "Processing '$($_.EMPLOYEE_ID)' ..." -ForegroundColor White -NoNewline
		If ($yesterday.Keys -contains $_.EMPLOYEE_ID) {
			If (Compare-Object -ReferenceObject $_ -DifferenceObject $yesterday[$_.EMPLOYEE_ID] -Property $workHeader) {
				Write-Host ' changed.' -ForegroundColor Yellow
				$_
			} Else {
				Write-Host ' no changes.' -ForegroundColor White
			}
		} Else {
			Write-Host ' added.' -ForegroundColor Green
			$_
		}
	}
	$yesterday.Values | Where-Object {$today.Keys -notcontains $_.EMPLOYEE_ID} | ForEach-Object {
		Write-Host "Processing '$($_.EMPLOYEE_ID)' ..." -ForegroundColor White -NoNewline
		Write-Host ' removed.' -ForegroundColor Red
		## Uncomment the following line to add removed employees to the output as well:
		# $_
	}
)
If ($differences) {
	$(
		$originalHeader
		$differences | Sort-Object -Property EMPLOYEE_ID | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
	) | Set-Content -Path $finalPath
} Else {
	Write-Host 'No differences found.'
}

Open in new window

C++ 11 Fundamentals

This course will introduce you to C++ 11 and teach you about syntax fundamentals.

Julie KurpaSr. Systems Programmer

Author

Commented:
Wow that's great oBdA!  I'm going through to see if I can make sense of it then I'll test it with the real data.  

As far as showing removed employees, I didn't think I needed to account for that since I went though the data and could see employees long retired were still in the list.  I think they are kept for historical purposes as this is related to work orders in our organization.  

I can certainly tweak this to handle if no changes were found (I hope).   Will let you know how the testing goes.
Julie KurpaSr. Systems Programmer

Author

Commented:
Absolutely amazing oBdA!  It's working beautifully!   You have saved me so much time and agony!
Julie KurpaSr. Systems Programmer

Author

Commented:
Absolutely amazing oBdA!  It's working beautifully!   You have saved me so much time and agony!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial