Solved

Automated Script to compare Two Text Files and Save a copy of Only Differences

Posted on 2013-11-08
Last Modified: 2016-07-10
My dearest Experts,

I want to compare two plain text files, original.txt and new.txt.
original.txt is a full customer list from a client that is a day old, and new.txt is the full customer list from the same client from today.  I want a script that compares the two files daily and saves only the lines in new.txt that did not exist exactly in original.txt to a file named diff.txt.

Example:

original.txt
One
Two
Three
Five
Six
Seven
Eight


new.txt
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Ten
Eleven


diff.txt
Four
Nine
Ten
Eleven


Is this at all possible?  I see plenty of options for comparing text with other applications, but I want to do this automatically on a schedule every day.

Also, please keep in mind that my sample is tiny compared to what I'm actually comparing.  The data files are full customer demographics, and each file is 60,000+ lines of text (made up of "~" delimited data).

Thank you in advance.

-Nick
Question by:NCollinsBBP
10 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39633301
sort original.txt > original.sort
sort new.txt > new.sort
comm -13 original.sort new.sort > diff.txt
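
As a quick check of the sort/comm approach against the sample data from the question, here is a minimal sketch. It assumes a POSIX shell with sort and comm available; the comm_demo directory name is hypothetical and only keeps the demo files out of the way. Note that the result comes back in sorted order, not in the order of new.txt.

```shell
# Scratch directory for the demo (hypothetical name, not from the thread).
mkdir -p comm_demo

# Recreate the sample files from the question.
printf '%s\n' One Two Three Five Six Seven Eight > comm_demo/original.txt
printf '%s\n' One Two Three Four Five Six Seven Eight Nine Ten Eleven > comm_demo/new.txt

# comm requires sorted input; use the same locale for sort and comm.
LC_ALL=C sort comm_demo/original.txt > comm_demo/original.sort
LC_ALL=C sort comm_demo/new.txt > comm_demo/new.sort

# -1 drops lines found only in the first file, -3 drops lines found in both,
# leaving the lines that appear only in new.txt.
LC_ALL=C comm -13 comm_demo/original.sort comm_demo/new.sort > comm_demo/diff.txt

# Prints Eleven, Four, Nine, Ten (one per line, in sorted order).
cat comm_demo/diff.txt
```

Because comm compares whole lines, a customer record that changed in any "~" delimited field counts as a new line and ends up in diff.txt.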
 
LVL 84

Expert Comment

by:ozo
ID: 39633315
Or, if diff.txt needs to keep the data in the same order as they appeared in new.txt:

perl -ne 'print if !$s{$_}++ && !@ARGV' original.txt new.txt > diff.txt
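
If Perl is not available, the same order-preserving idea can be sketched with awk. That awk is installed is an assumption about the environment, not something stated in the thread, and the awk_demo directory name is hypothetical. Like the Perl one-liner, it keeps the lines of original.txt in memory and prints each new line only once. One caveat: the NR == FNR idiom assumes original.txt is not empty.

```shell
# Scratch directory for the demo (hypothetical name, not from the thread).
mkdir -p awk_demo

# Recreate the sample files from the question.
printf '%s\n' One Two Three Five Six Seven Eight > awk_demo/original.txt
printf '%s\n' One Two Three Four Five Six Seven Eight Nine Ten Eleven > awk_demo/new.txt

# NR == FNR is true only while reading the first file: remember each line.
# For the second file, print a line only if it was never seen before,
# then mark it as seen so duplicates within new.txt are printed once.
awk 'NR == FNR { seen[$0] = 1; next } !seen[$0]++' \
    awk_demo/original.txt awk_demo/new.txt > awk_demo/diff.txt

# Prints Four, Nine, Ten, Eleven (one per line, in new.txt order).
cat awk_demo/diff.txt
```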
 
LVL 2

Expert Comment

by:burnocrash
ID: 39633353
If you want to do it in PowerShell, here is the script:

Compare-Object -ReferenceObject $(Get-Content .\original.txt) -DifferenceObject $(Get-Content .\new.txt) > diff.txt
 

Author Comment

by:NCollinsBBP
ID: 39633475
@ozo, I am not able to use Perl in my current environment.

@burnocrash, I have run the following script on my end in PowerShell...

compare-object -ReferenceObject $(Get-Content C:\test\old.txt) -DifferenceObject $(Get-Content new.txt) > C:\test\diff.txt  

Now, I get what I believe is the correct number of lines, but the output is not what I expected...

diff.txt starts with a blank line, then the two headers "InputObject" and "SideIndicator", then my results.  But I only get the first 56 characters of each line, followed by "...   =>"

Is it possible to get diff.txt to show JUST the difference results in full?

-Nick
 
LVL 2

Accepted Solution

by:burnocrash
burnocrash earned 500 total points
ID: 39638197
Compare-Object -ReferenceObject $(Get-Content .\original.txt) -DifferenceObject $(Get-Content .\new.txt) | Select-Object InputObject | Format-Table -Wrap
 

Author Comment

by:NCollinsBBP
ID: 39638562
@burnocrash
Success!  (With regard to the output in the PowerShell window.)  Can this be written to the "diff.txt" file?

My reason for doing this is that I receive a full customer file every day from a client. It has 60,000+ rows, of which only 75 to 100 lines are either updated or brand new.  Importing the whole file daily is killing my processing with duplicates.  I can save hours of processing if I can get just the differences / new items written out.  (And the client will not provide the resources to change the customer extract... which is why I'm in this boat.)

-Nick
 
LVL 2

Assisted Solution

by:burnocrash
burnocrash earned 500 total points
ID: 39640926
Just redirect the output to diff.txt.

Here is the code:

Compare-Object -ReferenceObject $(Get-Content .\original.txt) -DifferenceObject $(Get-Content .\new.txt) | Select-Object InputObject | Format-Table -Wrap > diff.txt

Enjoy :-)
 
LVL 11

Expert Comment

by:tel2
ID: 41702140
I suggest comment ID: 39640926 be accepted as the answer, as I see no reason to believe it didn't finish off the job.

Too bad the asker didn't specify the OS in the first place.  Would have saved ozo from wasting his time on it.