
How to find and eliminate duplicate files on your Windows systems.

Experience Level: Intermediate
5:34
Ed Covney
Retired USN in '88. Then IT & s/w dev. Fully retired in 2015. Now practicing math skills, long neglected, and learning VBA (to demo math).
Finding and deleting duplicate (picture) files can be a time-consuming task. My wife and I, our three kids, and their families all share one dilemma: managing our pictures. Across desktops, laptops, phones, tablets, and cameras, we've accumulated and shared tens of thousands of pictures over the last decade or two. Finding and deleting duplicates seemed like a good place to start managing them. Easier said than done.

About 8 or 9 years ago, I wrote a program that enumerates all files of a given type (extension) on any drive (or in any folder). I then added a section that computes an MD5 hash of each listed file. As each file is hashed, the program writes 4 pieces of data to a tab- (or comma-) delimited text file:
MD5 hash (32 hex characters),  file size,  file name,  full file name.
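
The enumerate-then-hash idea described above can be sketched in a few lines of Python. This is only an illustration, not the DupeFF.exe source (which is written in Delphi): the function names `md5_hex` and `write_report` are hypothetical, and the sketch assumes a plain suffix match on the extension and hashes each file in full.

```python
import csv
import hashlib
import os


def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 digest of a file as 32 hexadecimal characters."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_report(root, extension, report_path, delimiter="\t"):
    """Walk root (including sub-folders) and write one row per matching file.

    Each row holds: hash, file size, file name, full file name.
    Returns the number of rows written.
    """
    suffix = "." + extension.lower().lstrip(".")
    rows = 0
    with open(report_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter=delimiter)
        for folder, _subfolders, names in os.walk(root):
            for name in names:
                if not name.lower().endswith(suffix):
                    continue
                full_path = os.path.join(folder, name)
                size = os.path.getsize(full_path)
                writer.writerow([md5_hex(full_path), size, name, full_path])
                rows += 1
    return rows
```

Calling `write_report(some_folder, "jpg", "report.txtx")` would produce a tab-delimited file with the four columns listed above; pass `delimiter=","` for the comma variant.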

Video Steps

1. Select Drive

When you run DupeFF.exe, all available drive letters will be listed down the left side - choose one.

2. Select a Folder

You can search entire drives, or select a specific folder. Whatever you select will be searched entirely including all sub-folders.

3. Select a File Type

You can search for any file type. To select ALL files (all types), click the last type listed, "*". To select any other type, enter the extension where the "*" is. For example, to select any Excel spreadsheet type, enter "xl*" (it will find xls, xlsm, xlsb, etc.).
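
To see how a pattern such as "xl*" can select several extensions at once, here is a small Python sketch using the standard `fnmatch` module. It assumes the pattern is compared against the extension only and that case is ignored; how DupeFF.exe matches internally is not described here, and `extension_matches` is a hypothetical helper name.

```python
import fnmatch
import os


def extension_matches(file_name, pattern):
    """Return True if the file's extension matches a wildcard pattern.

    Assumption: matching is case-insensitive and applies to the text
    after the last dot only (no dot in the pattern).
    """
    extension = os.path.splitext(file_name)[1].lstrip(".").lower()
    return fnmatch.fnmatchcase(extension, pattern.lower())


for name in ["budget.xls", "macros.XLSM", "data.xlsb", "photo.jpg"]:
    print(name, extension_matches(name, "xl*"))
```

With the pattern "*", every file matches, including files that have no extension at all.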

4. Click the #4 Enumerate File List button

The program will list all the files it finds based on the criteria you provided. By default, the final report text file is tab delimited; if you prefer commas, check the "Commas" check box. Also by default, only the first 64K characters of each file are hashed; to hash entire files, check "Hash Full File Content" instead.
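
The trade-off between the two hashing options can be illustrated with a short Python sketch. It assumes "64K characters" means the first 65,536 bytes of the file; `PREFIX_BYTES` and `md5_of_file` are hypothetical names, not taken from DupeFF.exe.

```python
import hashlib

# Assumption: "64K characters" is read as 64 * 1024 bytes.
PREFIX_BYTES = 64 * 1024


def md5_of_file(path, full_content=False):
    """Hash either the first PREFIX_BYTES of a file or its full content."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        if full_content:
            # Stream the whole file in 1 MiB chunks.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        else:
            # Fast path: only the leading bytes contribute to the hash.
            digest.update(handle.read(PREFIX_BYTES))
    return digest.hexdigest()
```

Hashing only the prefix is much faster on large files, but two files that share their first 64K and differ later will get the same hash. Comparing file sizes as well, or re-checking suspected duplicates with the full-content option, reduces that risk.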

5. Hash all enumerated files

Click the #5 "Hash Enumerated File List" button. When complete, it creates a "txtx" text file. Note the extra x: you probably won't have "txtx" associated with a program, and the file can be very large. If it's a small file, let Notepad open it; otherwise open it with WordPad or Word. You can also import it directly into Excel. Wherever you open it, the goal is to copy its full contents to the clipboard.

6. Open DupeFF.xlsm

Once open, review the information on the "Instructions" tab. Taking the steps recommended, you'll be able to preview duplicates and once you're convinced duplicates are REALLY duplicates, you can easily delete even thousands of them in seconds.
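
For readers who want to inspect the report outside Excel, the following Python sketch groups rows that share the same hash and file size. It is not the logic of DupeFF.xlsm; it simply assumes the four-column layout described earlier (hash, file size, file name, full file name), and `find_duplicate_groups` is a hypothetical name. It only lists candidates and deletes nothing.

```python
import csv
from collections import defaultdict


def find_duplicate_groups(report_path, delimiter="\t"):
    """Group full file names by (hash, size); keep groups with 2+ files."""
    groups = defaultdict(list)
    with open(report_path, newline="", encoding="utf-8") as report:
        for row in csv.reader(report, delimiter=delimiter):
            if len(row) != 4:
                continue  # skip blank or malformed lines
            file_hash, size, _name, full_path = row
            groups[(file_hash, size)].append(full_path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Each returned group is a set of candidate duplicates; previewing them before deleting anything, as recommended above, is still the safe approach.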

EE allows me to attach the spreadsheet "DupeFF.xlsm" that I use in the video, but not the program. For those who'd like a copy of the DupeFF.exe program or its Delphi XE2 source code, please contact me directly:
ed(dot)covney(at)gmail(dot)com.
DupeFF.xlsm
 
