
How to find and eliminate duplicate files on your Windows systems.

Experience Level: Intermediate
5:34
Ed Covney
Retired USN in '88. Then IT & s/w dev. Fully retired in 2015. Now practicing math skills, long neglected, and learning VBA (to demo math).
Finding and deleting duplicate (picture) files can be a time-consuming task. My wife and I, our three kids and their families all share one dilemma: managing our pictures. Between desktops, laptops, phones, tablets, and cameras, we've accumulated and shared tens of thousands of pictures over the last decade or two. In an effort to begin managing them, finding and deleting duplicates seemed like a pretty good start. Easier said than done.

About 8 or 9 years ago, I wrote a program that enumerates all files of a given type (extension) on any drive or in any folder. Once all the file names are listed, a second step computes an MD5 hash of each file. As each file is hashed, the program writes four pieces of data to a tab (or comma) delimited text file:
32-character hexadecimal hash,  file size,  file name,  full file name.
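The enumerate-then-hash idea above can be sketched in a few lines of Python. This is not the DupeFF source (which is written in Delphi); the function names `md5_hex` and `write_report`, the UTF-8 report encoding, and the exact column formatting are assumptions made for illustration.

```python
import csv
import hashlib
import os


def md5_hex(path, chunk_size=65536):
    """Return the 32-character hexadecimal MD5 digest of a file's content."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_report(root, extension, report_path, delimiter="\t"):
    """Walk root (including sub-folders) and write one row per matching file.

    Each row holds: hash, file size, file name, full file name.
    Returns the number of rows written.
    """
    suffix = "." + extension.lower().lstrip(".")
    rows = 0
    with open(report_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter=delimiter)
        for folder, _subfolders, names in os.walk(root):
            for name in names:
                if not name.lower().endswith(suffix):
                    continue
                full_name = os.path.join(folder, name)
                writer.writerow(
                    [md5_hex(full_name), os.path.getsize(full_name), name, full_name]
                )
                rows += 1
    return rows
```

Calling `write_report(r"D:\Pictures", "jpg", r"C:\Temp\report.txt")` (both paths are placeholders) would produce a tab-delimited report; pass `delimiter=","` for a comma-delimited one.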

Video Steps

1. Select Drive

When you run DupeFF.exe, all available drive letters will be listed down the left side - choose one.

2. Select a Folder

You can search entire drives, or select a specific folder. Whatever you select will be searched in full, including all sub-folders.

3. Select a File Type

You can search for any file type. To select ALL files (all types), click the last type listed, "*". To select any other type, enter its extension where the "*" is. For example, to select any Excel spreadsheet type, enter "xl*" (it will find xls, xlsm, xlsb, etc.).
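For readers who want to see how a wildcard such as "xl*" can select a family of extensions, here is a minimal Python sketch. It assumes the pattern is compared against the extension only and that case is ignored; how DupeFF itself matches types is not documented here, and `matches_type` is a hypothetical helper name.

```python
import fnmatch
import os


def matches_type(file_name, type_pattern):
    """Return True if the file's extension matches a wildcard pattern such as 'xl*'."""
    # splitext gives ".xlsx"; drop the leading dot to compare with "xl*".
    extension = os.path.splitext(file_name)[1].lstrip(".")
    # Lower-case both sides so the comparison ignores case on every platform.
    return fnmatch.fnmatchcase(extension.lower(), type_pattern.lower())
```

With this helper, `matches_type("budget.xlsx", "xl*")` is true, while `matches_type("data.xml", "xl*")` is false because "xml" does not start with "xl".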

4. Click the #4 Enumerate File List button

The program will list all the files it finds based on the criteria you provided. By default, the final report text file will be tab delimited; if you prefer comma delimited, check the "Commas" box. Also by default, only the first 64K characters of each file are actually hashed. To hash entire files, check the "Hash Full File Content" box instead.
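The trade-off between hashing only the start of a file and hashing all of it can be illustrated with a short Python sketch. It assumes "64K" means 65,536 bytes; `PARTIAL_LIMIT` and `hash_file` are hypothetical names, not part of DupeFF.

```python
import hashlib

PARTIAL_LIMIT = 64 * 1024  # assumed meaning of "64K": 65,536 bytes


def hash_file(path, full_content=False):
    """Return an MD5 hex digest of the first 64K of a file, or of the whole file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        if full_content:
            # Slower, but every byte of the file contributes to the hash.
            for chunk in iter(lambda: handle.read(PARTIAL_LIMIT), b""):
                digest.update(chunk)
        else:
            # Faster: anything past the first 64K is ignored.
            digest.update(handle.read(PARTIAL_LIMIT))
    return digest.hexdigest()
```

A partial hash is much quicker on large files, but two files that share their first 64K and differ later will get the same hash. Comparing file sizes as well, or re-hashing suspected duplicates with `full_content=True`, reduces that risk.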

5. Hash all enumerated files

Click the #5 "Hash Enumerated File List" button. When complete, it creates a "txtx" text file (note the extra x). You probably won't have a program associated with the "txtx" extension, and the file can be very large. If it's a small file, let Notepad open it; otherwise, open it with WordPad or Word. You can also import it directly into Excel. Wherever you open it, the goal is to place a full copy of its contents on the clipboard.

6. Open DupeFF.xlsm

Once open, review the information on the "Instructions" tab. By following the recommended steps, you'll be able to preview duplicates and, once you're convinced they REALLY are duplicates, delete even thousands of them in seconds.
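For those curious how duplicates can be picked out of a report like this without a spreadsheet, here is a rough Python sketch. It is not what DupeFF.xlsm does internally (that logic is not described here); it simply assumes a UTF-8 report with four columns per row (hash, file size, file name, full file name) and groups rows that share both hash and size. `find_duplicates` is a hypothetical name.

```python
import csv
from collections import defaultdict


def find_duplicates(report_path, delimiter="\t"):
    """Group report rows by (hash, size) and return groups with more than one file."""
    groups = defaultdict(list)
    with open(report_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if len(row) != 4:
                continue  # skip blank or malformed lines
            file_hash, size, _name, full_name = row
            groups[(file_hash, size)].append(full_name)
    # Only keys shared by two or more files indicate likely duplicates.
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Each returned group lists full file names that are likely copies of one another. As with the spreadsheet, it is worth previewing a group before deleting anything from it.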

EE allows me to attach the spreadsheet "DupeFF.xlsm" that I use in the video, but not the program. For those who'd like a copy of the DupeFF.exe program or its Delphi XE2 source code, please contact me directly:
ed(dot)covney(at)gmail(dot)com.
 DupeFF.xlsm
 