
How to find and eliminate duplicate files on your Windows systems.

Experience Level: Intermediate
Ed Covney
Retired USN in '88. Then IT & s/w dev. Fully retired in 2015. Now practicing math skills, long neglected, and learning VBA (to demo math).
Finding and deleting duplicate (picture) files can be a time-consuming task. My wife and I, our three kids, and their families all share one dilemma: managing our pictures. Between desktops, laptops, phones, tablets, and cameras, over the last decade or two we've accumulated and shared tens of thousands of pictures. In an effort to begin managing them, finding and deleting duplicates seemed like a pretty good start. Easier said than done.

About 8 or 9 years ago, I wrote a program that enumerates all files of a given type (extension) on any drive (or in any folder). Once all the file names are listed, a second section computes an MD5 hash of each file. As each file is hashed, the program writes four pieces of data to a tab (or comma) delimited text file:
32-character hexadecimal hash,  file size,  file name,  full file name.
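
As a rough illustration of that record layout, here is a minimal Python sketch (DupeFF itself is written in Delphi, so this is not the program's code). The function names `md5_hex` and `report_line` are hypothetical, and the field order simply follows the list above.

```python
import hashlib
import os

def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 digest of a file as 32 hexadecimal characters."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def report_line(path, delimiter="\t"):
    """Build one record: hash, file size, file name, full file name."""
    full_name = os.path.abspath(path)
    fields = [
        md5_hex(full_name),
        str(os.path.getsize(full_name)),
        os.path.basename(full_name),
        full_name,
    ]
    return delimiter.join(fields)
```

Passing `delimiter=","` would give the comma variant, although file names that themselves contain commas would then need quoting.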

Video Steps

1. Select Drive

When you run DupeFF.exe, all available drive letters will be listed down the left side - choose one.

2. Select a Folder

You can search entire drives, or select a specific folder. Whatever you select will be searched entirely including all sub-folders.
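
For readers curious what "searched entirely including all sub-folders" amounts to, the sketch below shows one common way to do it in Python. It is only an illustration under that assumption, not DupeFF's actual code, and `enumerate_files` is a hypothetical name.

```python
import os

def enumerate_files(root):
    """Yield the full path of every file under root, including sub-folders."""
    # os.walk visits root and then every folder beneath it.
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            yield os.path.join(folder, name)
```

Calling `list(enumerate_files("D:\\"))` would list a whole drive, while passing a folder path limits the search to that folder and everything below it.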

3. Select a File Type

You can search for any file type. To select ALL files (all types), click the last type listed, "*". To select any other type, enter the extension where the "*" is. For example, to select any Excel spreadsheet type, enter "xl*" (it will find xls, xlsm, xlsb, etc.).
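
The wildcard behavior described here can be approximated in Python with the standard `fnmatch` module. This is a sketch under assumptions: `matches_type` is a hypothetical helper, the match is made case-insensitive on purpose, and a file with no extension is treated as matching "*", which may or may not be what DupeFF does.

```python
import fnmatch
import os

def matches_type(file_name, ext_pattern="*"):
    """True when the file's extension matches a wildcard such as 'xl*'."""
    # splitext returns ('budget', '.xlsm'); drop the leading dot.
    extension = os.path.splitext(file_name)[1].lstrip(".")
    # Lower-case both sides so 'XLSM' and 'xlsm' are treated alike.
    return fnmatch.fnmatch(extension.lower(), ext_pattern.lower())
```

With this helper, `matches_type("budget.xlsm", "xl*")` is true and `matches_type("notes.txt", "xl*")` is false.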

4. Click the #4 Enumerate File List button

The program will list all the files it finds based on the criteria you provided. By default, the final report text file is tab delimited; if you prefer commas, click the "Commas" check box. Also by default, only the first 64K characters of each file are hashed; to hash entire files, check "Hash Full File Content" instead.
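
To make the partial-hash option concrete, here is a small Python sketch of hashing either a file's leading portion or its full content. It assumes "64K characters" means 65,536 bytes; `PREFIX_BYTES` and `md5_of_file` are hypothetical names, and the real program may differ in detail.

```python
import hashlib

PREFIX_BYTES = 64 * 1024  # assumption: "64K characters" taken as 65,536 bytes

def md5_of_file(path, full_content=False):
    """Hash the first 64K of a file by default, or the whole file on request."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        if full_content:
            # Stream the whole file in 1 MB chunks.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        else:
            # Only the leading bytes contribute to the hash.
            digest.update(handle.read(PREFIX_BYTES))
    return digest.hexdigest()
```

Hashing only the start of each file is much faster on large files, but two files that differ only after the first 64K will get the same hash. Comparing the recorded file size as well, or rehashing suspected duplicates in full, helps guard against that.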

5. Hash all enumerated files

Click the #5 "Hash Enumerated File List" button. When complete, it creates a "txtx" text file (note the extra x). You probably won't have "txtx" associated with a program, and the file can be very large. If it's small, let Notepad open it; otherwise open it with WordPad or Word. You can also import it directly into Excel. Wherever you open it, the goal is to copy its full contents to the clipboard.

6. Open DupeFF.xlsm

Once open, review the information on the "Instructions" tab. Following the recommended steps, you'll be able to preview duplicates, and once you're convinced they REALLY are duplicates, you can delete even thousands of them in seconds.
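
The spreadsheet's own logic is not described in this article, but the general idea of spotting duplicates in the report can be sketched in Python: group the records by hash and file size, and keep any group with more than one file. The function name `find_duplicate_groups` is hypothetical, and the four-field layout is the one described earlier (hash, size, file name, full file name).

```python
import csv
import io
from collections import defaultdict

def find_duplicate_groups(report_text, delimiter="\t"):
    """Return lists of full file names that share the same hash and size."""
    groups = defaultdict(list)
    reader = csv.reader(io.StringIO(report_text), delimiter=delimiter)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        file_hash, size, _name, full_name = row
        groups[(file_hash, size)].append(full_name)
    # Only groups with two or more files are duplicate candidates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

This sketch only reports candidates and deliberately deletes nothing; as noted above, it is worth previewing the files before removing any of them.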

EE allows me to attach the spreadsheet "DupeFF.xlsm" that I use in the video, but not the program. For those who'd like a copy of the DupeFF.exe program or its Delphi XE2 source code, please contact me directly:
ed(dot)covney(at)gmail(dot)com.
 DupeFF.xlsm
 
2
Author:Ed Covney
0 Comments
