Solved

How can I compare several text files and keep only 1 copy of a file without duplication?

Posted on 2014-09-24
9
132 Views
Last Modified: 2014-10-01
I would like to compare several text files (e.g. new1.txt, new2.txt, new3.txt, ...), regardless of their names.
If the contents of one file are the same as another's, delete the duplicate so that only one copy remains.
0
Comment
Question by:100questions
9 Comments
 
LVL 17

Expert Comment

by:Emmanuel Adebayo
ID: 40342088
Hi,

You can use Multi-File Compare

To download files and see other information about the project, go to http://sourceforge.net/projects/multi-fcompare.

Rgds
0
 

Author Comment

by:100questions
ID: 40342173
Will this work in an existing batch script?
0
 
LVL 17

Expert Comment

by:Emmanuel Adebayo
ID: 40342192
No, this is an executable.
0

 

Author Comment

by:100questions
ID: 40342243
Then I would need something I can insert into an existing Windows batch file, or a new PowerShell or VBScript script, that can perform this function.
0
 
LVL 9

Accepted Solution

by:
dlb6597 earned 500 total points
ID: 40342313
Barebones and inefficient... definitely test this with a subset of your data first.
Basically, for every text file this launches a second for loop that compares that file against every other .txt file and deletes it if there is a match. The script starts over after each deletion because the (*.txt) set changes...

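:start
rem Compare every .txt file against every other one; fc returns errorlevel 0 on a match.
for %%i in (*.txt) do (
	for %%j in (*.txt) do (
		rem Quotes handle file names with spaces; ^>nul hides fc's report.
		if not "%%i" == "%%j" fc "%%i" "%%j" >nul && del "%%i" && goto start
	)
)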
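
For comparison only, here is a minimal sketch of the same idea in Python, assuming a Python 3 interpreter is available on the machine (the thread does not say so). It reads each file once and groups files by a SHA-256 digest of their bytes instead of comparing every pair. The function name remove_duplicate_files and its parameters are made up for illustration. Unlike the batch loop, it keeps the first file in sorted name order and deletes the later copies, and it compares raw bytes rather than using fc's text comparison.

```python
import hashlib
from pathlib import Path


def remove_duplicate_files(folder, pattern="*.txt"):
    """Delete files whose bytes match an earlier file; return the deleted names."""
    seen = {}      # SHA-256 digest -> first file found with that content
    deleted = []
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            # Same content as an earlier file, so drop this copy.
            path.unlink()
            deleted.append(path.name)
        else:
            seen[digest] = path
    return deleted
```

As with the batch version, try it on a copy of the folder first, e.g. remove_duplicate_files(r"C:\temp\copy_of_data"), where the path is a placeholder.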


0
 

Author Comment

by:100questions
ID: 40342336
Does this script look into the contents of the txt file?
0
 
LVL 9

Expert Comment

by:dlb6597
ID: 40342351
yes, it compares file contents using the fc command.
0
 

Author Comment

by:100questions
ID: 40344239
This seems to work. The problem, however, is that one of the files being compared contains a small right arrow at the end of the data (an ASCII EOF marker), and when the script sees that, it does not deduplicate properly.

Is there a way your script can be modified to ignore the ASCII EOF marker?
0
 
LVL 9

Expert Comment

by:dlb6597
ID: 40344258
Then the files aren't identical, are they? There is a /L parameter for FC, but I doubt it will make any difference, since the files are truly different.
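
If the only difference really is that trailing marker, one option is to strip it before comparing. The sketch below is in Python rather than batch and rests on an assumption: that the "small right arrow" is the legacy Ctrl-Z byte (0x1A), which is how that character is drawn in the old DOS code page. The helper names content_without_eof and same_content are hypothetical. Only 0x1A bytes at the very end of a file are ignored; anything else that differs still counts as a difference.

```python
from pathlib import Path

EOF_MARKER = b"\x1a"  # assumed: the legacy Ctrl-Z end-of-file byte


def content_without_eof(path):
    """Return the file's bytes with any trailing Ctrl-Z bytes removed."""
    return Path(path).read_bytes().rstrip(EOF_MARKER)


def same_content(first, second):
    """True when two files match once trailing EOF markers are ignored."""
    return content_without_eof(first) == content_without_eof(second)
```

For example, same_content("new1.txt", "new2.txt") would report a match when new2.txt differs only by that final marker.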
0


Question has a verified solution.
