troubleshooting Question

Powershell Script to sort and compare file contents

Avatar of Julie Kurpa
Julie KurpaFlag for United States of America asked on
Powershell
6 Comments1 Solution60 ViewsLast Modified:
On a daily basis, I want to compare two text files and place the differences between them into a 3rd text file.

The text files are comma delimited extracts from an Oracle DB which contain personnel data.  One file will be from the previous day.  The 2nd file will be the newest one with data changes.

I want my 3rd file to contain only the changed rows and new rows.

Some things to know about the data:
  1. The first column "EMPLOYEE_ID" will never change for existing records.  But new rows will be assigned a new EMPLOYEE_ID value.
  2. The headers for the data elements will not have double quotes.  However the data itself will include double quotes. I need the 1st line of headers included in the 3rd file creation.
  3. Yes I know that "EMPL_SUFFIX_CD" is in there twice.  I have no control over this. :(
  4. Each file, each day, will have headers at the top but the data itself will be sorted differently each day.  I have no control over this.

I have powerscript code put together, however the part that compares is not giving me what I expect.

My script does this:
1. Rename the newest file to append current date.  For the example data below, I've just named them "yesterdaydata" and "todaydata".
2. sort both text files by the 1st column and keep headers the top.  I'm assuming this will make it easier to compare the two files. I can't get this to work.
3a. check each line of file 1 (yesterday) with file 2 (today) and send any EMPLOYEE_ID data that has changed from file 2 (today) to the 3rd file.  <- this not working
3b. Also check if there are any new lines of data added (new EMPLOYEE_IDs) to File 2 (today) and put them in the 3rd file as well.  <-- not working
 


Example of contents:

yesterdayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Inactive"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVIN","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"


todayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


The differences between the files are:
CLAVIN changed to CLAVICAL for Employee_id 0000004403
New employee LEBECK
Employee "0000000002" changed to "Active"

What I want to show in the 3rd file is the following (not in any particular order but need headers at the top):

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


Here's the code I have so far:


$workingDir = "C:\PS_Scripts\"

# #################################################
# Obtain file name from yesterday employee file.
# #################################################
$YesterdayDate = Get-Date ((Get-Date).AddDays(-1)) -Format yyyyMMdd
$YesterdayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$YesterdayDate*")

"Yesterday file is: $YesterdayFile"


# #################################################
# Rename todays employee file to append todays date
# #################################################

$fileName = "C:\PS_Scripts\employees.txt"

# Check the file exists
if (-not(Test-Path $fileName)) {break}

# Display the original name
"Original filename: $fileName"

$fileObj = get-item $fileName

# Get the date
$DateStamp = get-date -uformat "%Y%m%d"

$extOnly = $fileObj.extension

if ($extOnly.length -eq 0) {
   $nameOnly = $fileObj.Name
   rename-item "$fileObj" "$nameOnly-$DateStamp"
   }
else {
   $nameOnly = $fileObj.Name.Replace( $fileObj.Extension,'')
   rename-item "$fileName" "$nameOnly-$DateStamp$extOnly"
   }

$TodayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$DateStamp*")

"TodayFile is:  $TodayFile"


# ##########################################################
# Sort the two files (or just the new one assuming we
#                     did the other one yesterday)

# ##########################################################

(Nothing I've tried so far for sorting has worked so I have no code to show).  
(For some attempts I get an error that a column exists twice in the data but I have no control over this)
   
# ##########################################################
# Compare yesterday employee list with todays employee list.
# Place differences into new employee text file
#
#  (This is not working either even after manually sorting the files content)
# ##########################################################

get-content $TodayFile |
where {$YesterdayFile -notcontains $_} |
set-content C:\PS_Scripts\diff.txt

# place first line (headers) into final file
get-content $TodayFile -First 1  > C:\PS_Scripts\employees_final.txt
get-content C:\PS_Scripts\diff.txt >> C:\PS_Scripts\employees_final.txt
ASKER CERTIFIED SOLUTION
Join our community to see this answer!
Unlock 1 Answer and 6 Comments.
Start Free Trial
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 1 Answer and 6 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros