Link to home
Start Free TrialLog in
Avatar of Julie Kurpa
Julie KurpaFlag for United States of America

asked on

Powershell Script to sort and compare file contents

On a daily basis, I want to compare two text files and place the differences between them into a 3rd text file.

The text files are comma delimited extracts from an Oracle DB which contain personnel data.  One file will be from the previous day.  The 2nd file will be the newest one with data changes.

I want my 3rd file to contain only the changed rows and new rows.

Some things to know about the data:
  1. The first column "EMPLOYEE_ID" will never change for existing records.  But new rows will be assigned a new EMPLOYEE_ID value.
  2. The headers for the data elements will not have double quotes.  However the data itself will include double quotes. I need the 1st line of headers included in the 3rd file creation.
  3. Yes I know that "EMPL_SUFFIX_CD" is in there twice.  I have no control over this. :(
  4. Each file, each day, will have headers at the top but the data itself will be sorted differently each day.  I have no control over this.

I have powerscript code put together, however the part that compares is not giving me what I expect.

My script does this:
1. Rename the newest file to append current date.  For the example data below, I've just named them "yesterdaydata" and "todaydata".
2. sort both text files by the 1st column and keep headers the top.  I'm assuming this will make it easier to compare the two files. I can't get this to work.
3a. check each line of file 1 (yesterday) with file 2 (today) and send any EMPLOYEE_ID data that has changed from file 2 (today) to the 3rd file.  <- this not working
3b. Also check if there are any new lines of data added (new EMPLOYEE_IDs) to File 2 (today) and put them in the 3rd file as well.  <-- not working
 


Example of contents:

yesterdayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Inactive"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVIN","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"


todayfile contents:

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000001012","PETERSON","NORM","P","","","600","Office of Cheers","xxxx","Beer Taster","112 Broad St","","NY","55555","5555555555","0000","JR. ACCOUNTANT                                                                                ","","","Inactive"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


The differences between the files are:
CLAVIN changed to CLAVICAL for Employee_id 0000004403
New employee LEBECK
Employee "0000000002" changed to "Active"

What I want to show in the 3rd file is the following (not in any particular order but need headers at the top):

EMPLOYEE_ID,EMPL_LAST_NM,EMPL_FIRST_NM,EMPL_MIDDLE_NM,EMPL_SUFFIX_CD,EMAIL_ADDR,HOME_DEPT_CD,DEPT_NM,HOME_UNIT_CD,UNIT_NM,STR_1_NM,STR_2_NM,CITY_NM,ZIP,PH_NO,TITLE_CD,TITL_LONG_DD,EMPL_SUFFIX_CD,INT_RPT_POSN_ID_NO,EMPLOYEE_STATUS
"0000000002","CHAMBERS","DIANE","S","","","400","Office Of The Upity","xxxx","Book Reader","111 Broad St","","NY","55555","5555555555","0000","SR. SUPERVISOR                                                                                ","","","Active"
"0000004403","CLAVICAL","CLIFF","Q","","","500","Office Of The Mail","xxxx","Stamp Inspector","39 Broad St","","NY","55555","5555555555","0000","SR. INSPECTOR                                                                                ","","","Inactive"
"0000009901","LEBECK","CARLA","B","","","900","Office Of The Earings","xxxx","Earring Tester","14 Broad St","","NY","55555","5555555555","0000","SR. FUNKY DRESSER                                                                                ","","","Inactive"


Here's the code I have so far:


$workingDir = "C:\PS_Scripts\"

# #################################################
# Obtain file name from yesterday employee file.
# #################################################
$YesterdayDate = Get-Date ((Get-Date).AddDays(-1)) -Format yyyyMMdd
$YesterdayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$YesterdayDate*")

"Yesterday file is: $YesterdayFile"


# #################################################
# Rename todays employee file to append todays date
# #################################################

$fileName = "C:\PS_Scripts\employees.txt"

# Check the file exists
if (-not(Test-Path $fileName)) {break}

# Display the original name
"Original filename: $fileName"

$fileObj = get-item $fileName

# Get the date
$DateStamp = get-date -uformat "%Y%m%d"

$extOnly = $fileObj.extension

if ($extOnly.length -eq 0) {
   $nameOnly = $fileObj.Name
   rename-item "$fileObj" "$nameOnly-$DateStamp"
   }
else {
   $nameOnly = $fileObj.Name.Replace( $fileObj.Extension,'')
   rename-item "$fileName" "$nameOnly-$DateStamp$extOnly"
   }

$TodayFile = $workingDir + (Get-ChildItem C:\PS_Scripts\ -Filter "employees-$DateStamp*")

"TodayFile is:  $TodayFile"


# ##########################################################
# Sort the two files (or just the new one assuming we
#                     did the other one yesterday)

# ##########################################################

(Nothing I've tried so far for sorting has worked so I have no code to show).  
(For some attempts I get an error that a column exists twice in the data but I have no control over this)
   
# ##########################################################
# Compare yesterday employee list with todays employee list.
# Place differences into new employee text file
#
#  (This is not working either even after manually sorting the files content)
# ##########################################################

get-content $TodayFile |
where {$YesterdayFile -notcontains $_} |
set-content C:\PS_Scripts\diff.txt

# place first line (headers) into final file
get-content $TodayFile -First 1  > C:\PS_Scripts\employees_final.txt
get-content C:\PS_Scripts\diff.txt >> C:\PS_Scripts\employees_final.txt
Avatar of Michelangelo
Michelangelo
Flag of Italy image

You should definitely provide some more details on file contents and the code you have tried.
Avatar of Julie Kurpa

ASKER

Michelangelo, I thought I did that.  What else should I provide?
ASKER CERTIFIED SOLUTION
Avatar of oBdA
oBdA

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Wow that's great oBdA!  I'm going through to see if I can make sense of it then I'll test it with the real data.  

As far as showing removed employees, I didn't think I needed to account for that since I went though the data and could see employees long retired were still in the list.  I think they are kept for historical purposes as this is related to work orders in our organization.  

I can certainly tweak this to handle if no changes were found (I hope).   Will let you know how the testing goes.
Absolutely amazing oBdA!  It's working beautifully!   You have saved me so much time and agony!
Absolutely amazing oBdA!  It's working beautifully!   You have saved me so much time and agony!