Looking for duplicate file finder / remover software

Posted on 2008-06-12
Medium Priority
Last Modified: 2013-11-14
We're looking for software that can find duplicate files of all types, either locally or remotely (on a network share). The app should match files not just by name but by "signature" (something like an MD5 hash), so that it finds duplicates even when the filenames differ but the contents are the same (that's where other, more "intelligent" comparison mechanisms come into play). It has to work on Windows XP/2003 and should be robust. I would like to avoid Java-based solutions, as they tend to be slow and require Java, which is not available on every system.

Please don't post the top 5 Google search results - I can use Google myself quite nicely. This will need to run on XP/2K3; a free solution would be ideal. Any commercial software should be reasonably priced and preferably from an established company, not some fly-by-night operation that won't be there tomorrow to support the product.

Thanks in advance!
Question by:CynepMeH
LVL 70

Accepted Solution

garycase earned 2000 total points
ID: 21776293
Duplicate File Finder [Available here:  http://brooksyounce.byethost13.com/ ] will do what you want ... it finds all duplicates regardless of the filenames, dates, etc.   It WILL, of course, potentially take a very long time if you have it set to search a very large set of files ... but it does the job.   I recently ran it overnight to identify duplicates across three 750GB drives => I didn't time the search (I was gone all of the next day) ... but it definitely took a long time.   I repeated the search with the "Fast Search" option checked, and it was MUCH faster ... and found the same set of duplicates, even though the "Fast Search" option has a "less accurate" caveat next to it.

It works fine on local drives; external drives; network drives; etc.

LVL 11

Author Comment

ID: 21797105
Thanks for the suggestion, but it didn't work out - it is too limited in features and a bit on the sluggish side.
I think at this point I'm just about ready to give up on "free" solutions. I've tried several and they were worth exactly what I paid for them. I think I should focus on commercial solutions, but they need to be comprehensive in terms of features. If there are any commercial products you folks are familiar with, please post.

LVL 70

Expert Comment

ID: 21797543
Any tool that has to examine every file regardless of the name, date, or any other identifying feature will be "sluggish" with a large # of files.   The tool I suggested works MUCH faster if you check the "Fast Search" option ... and is still very good at finding duplicates (even though it does warn you that this is "less accurate").    Even commercial tools will be "sluggish" with the requirements you've noted to find the duplications without any limitations on the search parameters.
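
One common way to cut down the amount of reading such a search has to do is to compare file sizes first: two files can only be identical if they are the same size, so only files that share a size need their contents examined. The Python sketch below illustrates that idea. It is an assumption for illustration only; the thread does not say how Duplicate File Finder or its "Fast Search" option work internally, and the function name `candidates_by_size` is made up here.

```python
import os
from collections import defaultdict


def candidates_by_size(roots):
    """Group file paths under the given root folders by size in bytes.

    Returns only the sizes shared by two or more files, since a file
    with a unique size cannot have a duplicate.
    """
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    by_size[os.path.getsize(path)].append(path)
                except OSError:
                    continue  # unreadable or vanished file: skip it
    return {size: paths for size, paths in by_size.items() if len(paths) > 1}
```

Files that survive this filter are only candidates; their contents still have to be compared (for example by hash) before they can be called duplicates.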

LVL 11

Author Comment

ID: 21815389
Gary, understood - I think performance is not a critical requirement, as long as the job gets done. The issue is that we may have a number of files that are named differently but whose contents are the same. As an example, User A may save a .Net Framework 2.0 file as "dotnetfx.exe". User B may save the same file as "dotnetframework.exe" and User C may have saved it as "MSKBXXXXXX.EXE". I want to be able to identify these types of duplicates as well (MD5 hash?).

So far, I haven't seen within this software any way to identify such instances - maybe I'm missing something.
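
For reference, the MD5 idea raised above can be sketched in a few lines of Python: hash each file's contents and group the paths by digest, ignoring names entirely. This is a minimal sketch under assumptions, not a description of any product mentioned in this thread; the function names `file_digest` and `duplicates_by_content` are hypothetical.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicates_by_content(paths):
    """Group paths whose contents hash to the same digest; names are ignored."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    # Keep only digests seen more than once, i.e. actual duplicate sets.
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

With this approach "dotnetfx.exe" and "dotnetframework.exe" land in the same group whenever their bytes are identical. `hashlib.sha256` can be swapped in for `hashlib.md5` if collision resistance matters.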
LVL 70

Expert Comment

ID: 21816531
As long as the location of those files is within the search paths you set, it will identify them as duplicates in that case.

For example, create a file called AAAAA.111 (it can be anything ... just copy some other file and rename it; create a new doc; etc.).   Store it at a known location ... say C:\TestFolder\AAAAA.111

Now copy the file somewhere ... perhaps to D:\MoreStuff\ ... and rename the copy (say to BBBBB.222), so the file at D:\MoreStuff\BBBBB.222  is now a duplicate of C:\TestFolder\AAAAA.111

Now copy the file somewhere else ... perhaps to a mapped network drive K:\DistantStuff\ ... and rename it yet again, perhaps to CCCCC.333, so the file K:\DistantStuff\CCCCC.333 is yet another duplicate of the same file.

When you run Duplicate File Finder, the first thing you need to do is use the "Add Path" button to set the search paths for the duplicates.  It will find ANY duplicates within those search paths, regardless of their names, creation dates, etc. => if they're duplicates, they'll be identified.

In the example above, if you clicked "Add Path" and selected C:\TestFolder as a path; then clicked "Add Path" and selected K:\DistantStuff (or if the network location isn't mapped you can simply "point" to the network location); then clicked on Start Search, it would find AAAAA.111 and CCCCC.333 as duplicates, but would not find BBBBB.222 because you didn't include D:\MoreStuff in the search path.     The search would be quicker if you checked the "Fast Search" box before you clicked on Start Search.    Note you could also simply set the paths to C:, D:, and K: and it would find all 3 of these duplicates ... but this would take a LONG time as it would check EVERY file on all three drives => but it would indeed find all instances of duplicates (which is what you asked for).
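
The effect of the search paths in the example above can be reproduced with a short Python sketch: only files under the folders you pass in are examined, so a copy stored elsewhere is never reported. This is an illustrative assumption about the general technique, not the tool's actual code, and `find_duplicates` is a hypothetical name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(search_paths):
    """Return groups of identical files found under the given folders only."""
    groups = defaultdict(list)
    for root in search_paths:
        for path in Path(root).rglob("*"):
            if path.is_file():
                # read_bytes() loads the whole file; fine for a small demo,
                # but large files should be hashed in chunks instead.
                digest = hashlib.md5(path.read_bytes()).hexdigest()
                groups[digest].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates([r"C:\TestFolder", r"K:\DistantStuff"])` would report AAAAA.111 and CCCCC.333 together, while BBBBB.222 would only appear once `r"D:\MoreStuff"` is added to the list.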
LVL 11

Author Comment

ID: 21816888
Gary, thanks for your suggestions. I want to see what else may surface, although it is beginning to look like we may have to go with commercial software after all, as I am told we'll need to produce reports too.

I'm currently looking at Quest and NTP Software tools - we'll see how much they cost and what they can do.

If you know of any commercial products that can provide more features and benefits, please advise.

LVL 70

Expert Comment

ID: 22134938
The product I suggested would work fine ==> as I noted, it finds "... duplicates even if the filenames do not match ...";  works "... on Windows XP/2003 ...";  is reasonably "... robust ..." ; is not "... java-based ...";  and is free ("... free solution would be ideal ...").   It is, as the author noted, a bit "... sluggish ..." => but any product that's not index-based and has to search the entire path will take a bit of time.

Bottom line:  Duplicate File Finder would certainly seem to have been a solution that would resolve the question.   The fact that the asker elected to go a different path doesn't negate that solution.


Question has a verified solution.
