• Status: Solved
  • Priority: Medium
  • Security: Public

Comparing text documents (originally HTML docs, but saved to text files)

I'm comparing two text documents with different text flows and embedded hard returns (0x0D 0x0A).

There are numerous sections where file A might read
   The quick brown fox
   jumped over the
   lazy dog
and file B might read
   The quick brown fox jumped
   over the lazy dog.

Converted to a canonical form, these two sentences are identical, but my comparison tool (Beyond Compare) shows differences due to the hard returns.

I could write something to trim the hard returns, but am hoping there is an easier way.  There will be insertions, deletions, and changes between the two files.

Any ideas?  I'm on Windows 7, have Visual Studio 2008, have Beyond Compare.
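For reference, the "trim the hard returns" idea mentioned above could look roughly like the sketch below. It is only an illustration, not a feature of Beyond Compare: the function name `canonicalize` and the rule that a blank line separates paragraphs are assumptions, and the sample for file B omits the trailing period so that the two samples really are identical once reflowed.

```python
import re

def canonicalize(text):
    """Join hard-wrapped lines so that each paragraph becomes a single line."""
    # Treat CRLF (0x0D 0x0A), lone CR, and LF alike.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: a blank line (possibly holding spaces) separates paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, collapse every whitespace run, hard returns included.
    joined = [" ".join(p.split()) for p in paragraphs]
    return "\n".join(p for p in joined if p)

file_a = "The quick brown fox\r\njumped over the\r\nlazy dog\r\n"
file_b = "The quick brown fox jumped\r\nover the lazy dog\r\n"
print(canonicalize(file_a) == canonicalize(file_b))
```

Writing the canonicalized text of each file to a temporary file would then let an ordinary line-based comparison tool show only the real insertions, deletions, and changes, one paragraph per line.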
Asked by: josgood
1 Solution
 
ICaldwell commented:
Try Diffuse: http://diffuse.sourceforge.net/

There are Windows and Linux versions. It's free and works great.
 
ICaldwell commented:
Here is a site listing other free file comparison tools, if you would like to check out alternatives:

http://www.thefreecountry.com/programming/filecomparison.shtml
 
josgood (author) commented:
Thanks for the ideas!  I'll check it/them out.
 
josgood (author) commented:
I'm running Windows, not Linux, so I don't have a ready way to handle deb files.

I'll take a look at the other link you suggested.
 
ICaldwell commented:
 
josgood (author) commented:
Thank you.  I have Diffuse installed and I'll take a look at it.
 
josgood (author) commented:
Well, Diffuse has the same problem as Beyond Compare: embedded hard newlines (0x0D 0x0A on Windows) are honored, even with end-of-line characters and whitespace ignored.

Diffuse looks like a good tool, but it's not doing the job I want.
 
ICaldwell commented:
I think you're going to run into this with every text comparer unless you go to a hex/binary comparison. Some programs can check white space, but the default is not to compare hard new lines unless you're in hex/binary mode.

If you do go to hex/binary mode, you're going to have problems viewing the data the way you're used to.
 
josgood (author) commented:
I think you're right.  I was hoping to simply use a tool that already exists, without needing to write anything.

I appreciate your help.
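If it does come down to writing something, one possible approach is to compare the two files word by word rather than line by line, so that the position of the hard returns no longer matters. The following is a rough sketch only: it uses Python's standard `difflib` module, and the function name `word_diff` and the sample strings are made up for illustration.

```python
import difflib

def word_diff(text_a, text_b):
    """Return (operation, words_from_a, words_from_b) for each differing span."""
    # split() with no arguments breaks on any whitespace, hard returns included,
    # so the way each file happens to be wrapped is ignored.
    words_a = text_a.split()
    words_b = text_b.split()
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    changes = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(words_a[a1:a2]), " ".join(words_b[b1:b2])))
    return changes

old = "The quick brown fox\r\njumped over the\r\nlazy dog"
new = "The quick red fox jumped\r\nover the lazy dog today"
for change in word_diff(old, new):
    print(change)
```

The operations reported are `replace`, `delete`, and `insert`, which correspond to the changes, deletions, and insertions expected between the two files. Text that has only been reflowed produces no output.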
 
josgood (author) commented:
Thank you.