Solved

Looking for duplicate file finder / remover software

Posted on 2008-06-12
Last Modified: 2013-11-14
We're looking for software that can find duplicate files of all types, either locally or remotely (network share). The app should be able to find files not just by name but by "signature" (something like an MD5 hash). It should be able to find duplicates even if the filenames do not match but the contents are the same (that's where other, more "intelligent" comparison mechanisms come into play). It has to work on Windows XP/2003 and should be robust. I would like to avoid Java-based solutions, as they tend to be slow and require Java, which is not available on every system.

Please don't post the top 5 Google search results - I can use Google myself quite nicely. This will need to run on XP/2K3; a free solution would be ideal. Any commercial software should be reasonably priced and preferably from an established company, not some fly-by-night operation that won't be there tomorrow to support the product.

Thanks in advance!
Question by:CynepMeH
 
LVL 70

Accepted Solution

by:
garycase earned 500 total points
ID: 21776293
Duplicate File Finder [available here: http://brooksyounce.byethost13.com/ ] will do what you want ... it finds all duplicates regardless of the filenames, dates, etc.   It WILL, of course, potentially take a very long time if you have it set to search a very large set of files ... but it does the job.   I recently ran it overnight to identify duplicates across three 750GB drives => I didn't time the search (I was gone all of the next day) ... but it definitely took a long time.   I repeated the search with the "Fast Search" option checked, and it was MUCH faster ... and found the same set of duplicates, even though the "Fast Search" option has a "less accurate" caveat next to it.

It works fine on local drives; external drives; network drives; etc.

 
LVL 11

Author Comment

by:CynepMeH
ID: 21797105
Thanks for the suggestion, but it didn't work out - it is too limited in features and a bit on the sluggish side.
I think at this point I'm just about ready to give up on "free" solutions. I've tried several and they were worth exactly what I paid for them. I think I should focus on commercial solutions, but they need to be comprehensive in terms of features. If there are any commercial products you folks are familiar with, please post.

Thanks!
 
LVL 70

Expert Comment

by:garycase
ID: 21797543
Any tool that has to examine every file regardless of the name, date, or any other identifying feature will be "sluggish" with a large number of files.   The tool I suggested works MUCH faster if you check the "Fast Search" option ... and is still very good at finding duplicates (even though it does warn you that this is "less accurate").    Even commercial tools will be "sluggish" given the requirements you've noted: finding duplicates without any limits on the search parameters.
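
For illustration only: one common way duplicate finders cut down the work is to compare file sizes first, since two files of different sizes cannot have identical contents, and only examine the contents of files that share a size. Whether Duplicate File Finder's "Fast Search" works this way is not stated in this thread; the sketch below is an assumption about the general technique, and the function name same_size_groups is hypothetical.

```python
import os
from collections import defaultdict


def same_size_groups(paths):
    """Group paths by byte size and keep only sizes shared by 2+ files.

    A file whose size is unique cannot have a duplicate, so it never
    needs to be read at all.
    """
    by_size = defaultdict(list)
    for path in paths:
        try:
            by_size[os.path.getsize(path)].append(path)
        except OSError:
            continue  # unreadable or vanished file: skip it
    return [sorted(group) for group in by_size.values() if len(group) > 1]
```

Files that land in the same group are only candidates; their contents still have to be compared (by hash or byte by byte) before they can be called duplicates.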
 
LVL 11

Author Comment

by:CynepMeH
ID: 21815389
Gary, understood - I think performance is not a critical requirement, as long as the job is done. The issue is that we may have a number of files that are named differently but whose contents are the same. As an example, user A may save a .NET Framework 2.0 file as "dotnetfx.exe". User B may save the same file as "dotnetframework.exe" and user C may have saved it as "MSKBXXXXXX.EXE". I want to be able to identify these types of duplicates as well (MD5 hash?).

So far, I haven't seen within this software any way to identify such instances - maybe I'm missing something.
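
As a rough illustration of the MD5 idea mentioned above (not a description of how any product in this thread works), the sketch below hashes each file's contents and groups files whose digests match, ignoring names entirely. The function names md5_of and group_by_content are hypothetical.

```python
import hashlib
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(paths):
    """Map each digest to the paths that share it, keeping groups of 2+."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of(path)].append(path)
    return {d: sorted(p) for d, p in groups.items() if len(p) > 1}
```

With this approach "dotnetfx.exe", "dotnetframework.exe" and "MSKBXXXXXX.EXE" would end up in one group if their bytes are identical. A cautious tool would confirm a match with a byte-by-byte comparison before deleting anything.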
 
LVL 70

Expert Comment

by:garycase
ID: 21816531
As long as the location of those files is within the search paths you set, it will identify them as duplicates in that case.

For example, create a file called AAAAA.111 (it can be anything ... just copy some other file and rename it; create a new doc; etc.).   Store it at a known location ... say C:\TestFolder\AAAAA.111

Now copy the file somewhere ... perhaps to D:\MoreStuff\ ... and rename the copy (say to BBBBB.222), so the file at D:\MoreStuff\BBBBB.222  is now a duplicate of C:\TestFolder\AAAAA.111

Now copy the file somewhere else ... perhaps to a mapped network drive K:\DistantStuff\ ... and rename it yet again, perhaps to CCCCC.333, so the file K:\DistantStuff\CCCCC.333 is yet another duplicate of the same file.

When you run Duplicate File Finder, the first thing you need to do is use the "Add Path" button to set the search paths for the duplicates.  It will find ANY duplicates within those search paths, regardless of their names, creation dates, etc. => if they're duplicates, they'll be identified.

In the example above, if you clicked "Add Path" and selected C:\TestFolder as a path; then clicked "Add Path" and selected K:\DistantStuff (or if the network location isn't mapped you can simply "point" to the network location); then clicked on Start Search, it would find AAAAA.111 and CCCCC.333 as duplicates, but would not find BBBBB.222 because you didn't include D:\MoreStuff in the search path.     The search would be quicker if you checked the "Fast Search" box before you clicked on Start Search.    Note you could also simply set the paths to C:, D:, and K: and it would find all 3 of these duplicates ... but this would take a LONG time as it would check EVERY file on all three drives => but it would indeed find all instances of duplicates (which is what you asked for).
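
The search-path behaviour described in this example can be mimicked in a few lines, purely as a sketch: only files under the chosen roots are ever examined, so a copy stored elsewhere is never reported. This is not Duplicate File Finder's code; the function name duplicates_under is hypothetical, and MD5 is assumed as the comparison method.

```python
import hashlib
import os
from collections import defaultdict


def duplicates_under(roots):
    """Walk every root and group files whose full contents hash identically."""
    groups = defaultdict(list)
    for root in roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                path = os.path.join(folder, name)
                with open(path, "rb") as handle:
                    # Reads the whole file at once for brevity;
                    # read in chunks for very large files.
                    digest = hashlib.md5(handle.read()).hexdigest()
                groups[digest].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling it with the equivalents of C:\TestFolder and K:\DistantStuff would pair AAAAA.111 with CCCCC.333; adding D:\MoreStuff to the list would bring BBBBB.222 into the same group.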
 
LVL 11

Author Comment

by:CynepMeH
ID: 21816888
Gary, thanks for your suggestions. I want to see what else may surface, although it is beginning to look like we may have to go with commercial software after all, as I am told we'll need to produce reports too.

I'm currently looking at Quest and NTP Software tools - we'll see how much they cost and what they can do.

If you know of any commercial products that can provide more features and benefits, please advise.

Thanks.
 
LVL 70

Expert Comment

by:garycase
ID: 22134938
The product I suggested would work fine ==> as I noted, it finds "... duplicates even if the filenames do not match ...";  works "... on Windows XP/2003 ...";  is reasonably "... robust ..." ; is not "... java-based ...";  and is free ("... free solution would be ideal ...").   It is, as the author noted, a bit "... sluggish ..." => but any product that's not index-based and has to search the entire path will take a bit of time.

Bottom line:  Duplicate File Finder would certainly seem to have been a solution that would resolve the question.   The fact that the asker elected to go a different path doesn't negate that solution.