Solved

Looking for duplicate file finder / remover software

Posted on 2008-06-12
Last Modified: 2013-11-14
We're looking for software that can find duplicate files of all types, either locally or remotely (network share). The app should be able to find files not just by name but by "signature" (something like an MD5 hash). It should be able to find duplicates even if the filenames do not match but the contents are the same (that's where other, more "intelligent" comparison mechanisms come into play). It has to work on Windows XP/2003 and should be robust. I would like to avoid Java-based solutions, as they tend to be slow and require Java, which is not available on every system.

Please don't post the top 5 Google searches - I can use Google myself quite nicely. This will need to run on XP/2K3; a free solution would be ideal. Any commercial software should be reasonably priced and preferably from an established company, not some fly-by-night operation that won't be there tomorrow to support the product.

Thanks in advance!
Question by:CynepMeH
9 Comments
 
LVL 70

Accepted Solution

by:
garycase earned 500 total points
ID: 21776293
Duplicate File Finder [Available here:  http://brooksyounce.byethost13.com/ ] will do what you want ... it finds all duplicates regardless of the filenames, dates, etc.   It WILL, of course, take a potentially very long time if you have it set to search a very large set of files ... but it does the job.   I recently ran it overnight to identify duplicates across three 750GB drives => I didn't time the search (I was gone all of the next day) ... but it definitely took a long time.   I repeated the search with the "Fast Search" option checked, and it was MUCH faster ... and found the same set of duplicates, even though the "Fast Search" option has a "less accurate" caveat by it.

It works fine on local drives; external drives; network drives; etc.

 
LVL 11

Author Comment

by:CynepMeH
ID: 21797105
Thanks for the suggestion, but it didn't work out - it is too limited in features and a bit on the sluggish side.
I think at this point I'm just about ready to give up on "free" solutions. I've tried several and they were worth exactly what I paid for them. I think I should focus on commercial solutions, but they need to be comprehensive in terms of features. If there are any commercial products you folks may be familiar with, please post.

Thanks!
 
LVL 70

Expert Comment

by:garycase
ID: 21797543
Any tool that has to examine every file regardless of the name, date, or any other identifying feature will be "sluggish" with a large # of files.   The tool I suggested works MUCH faster if you check the "Fast Search" option ... and is still very good at finding duplicates (even though it does warn you that this is "less accurate").    Even commercial tools will be "sluggish" with the requirements you've noted to find the duplications without any limitations on the search parameters.
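To illustrate why a shortcut can speed things up so much: I don't know what Duplicate File Finder's "Fast Search" actually does internally, so the following is only a sketch of one common trick, not a description of that tool. Two files can only be identical if they have the same size, so files with a unique size never need to be read at all. The function names (`sha256_of`, `duplicates_size_first`) are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def duplicates_size_first(paths):
    """Return (duplicate_groups, number_of_files_actually_hashed).

    Hypothetical helper: files are bucketed by size first, and only
    buckets holding two or more files are read and hashed.
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    hashed = 0
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size: cannot have a duplicate, skip reading it
        by_digest = defaultdict(list)
        for path in same_size:
            by_digest[sha256_of(path)].append(path)
            hashed += 1
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups, hashed
```

The second return value is there only to show how many files had to be read; on a drive where most files have distinct sizes, that number is usually a small fraction of the total.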
 
LVL 11

Author Comment

by:CynepMeH
ID: 21815389
Gary, understood - I think performance is not a critical requirement, as long as the job is done. The issue is that we may have a number of files that are named differently but whose contents are the same. As an example, we may have User A save a .Net Framework 2.0 file as "dotnetfx.exe". User B may save the same file as "dotnetframework.exe" and User C may have saved it as "MSKBXXXXXX.EXE". I want to be able to identify these types of duplicates as well (MD5 hash?).

So far, I haven't seen within this software any way to identify such instances - maybe I'm missing something.
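For reference, the "signature" idea (MD5 hash) can be sketched in a few lines of Python. This is only an illustration of the general technique, assuming every file under the given folders can be read; it is not how Duplicate File Finder or any commercial product is known to work, and the names `file_md5` and `find_duplicates` are invented for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(search_paths):
    """Group files under search_paths by content hash, ignoring names.

    Returns a list of groups; each group is a sorted list of paths whose
    contents produced the same MD5 digest.
    """
    by_hash = defaultdict(list)
    seen = set()
    for root in search_paths:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.abspath(os.path.join(dirpath, name))
                if full in seen:
                    continue  # overlapping search paths: count each file once
                seen.add(full)
                try:
                    by_hash[file_md5(full)].append(full)
                except OSError:
                    continue  # unreadable file (locked, permissions): skip it
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

With this approach "dotnetfx.exe", "dotnetframework.exe" and "MSKBXXXXXX.EXE" would land in the same group as long as their bytes are identical, because the filename never enters the comparison. A more careful tool would confirm a hash match with a byte-by-byte comparison before deleting anything.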
 
LVL 70

Expert Comment

by:garycase
ID: 21816531
As long as the location of those files is within the search paths you set, it will identify them as duplicates in that case.

For example, create a file called AAAAA.111 (it can be anything ... just copy some other file and rename it; create a new doc; etc.).   Store it at a known location ... say C:\TestFolder\AAAAA.111

Now copy the file somewhere ... perhaps to D:\MoreStuff\ ... and rename the copy (say to BBBBB.222), so the file at D:\MoreStuff\BBBBB.222  is now a duplicate of C:\TestFolder\AAAAA.111

Now copy the file somewhere else ... perhaps to a mapped network drive K:\DistantStuff\ ... and rename it yet again, perhaps to CCCCC.333, so the file K:\DistantStuff\CCCCC.333 is yet another duplicate of the same file.

When you run Duplicate File Finder, the first thing you need to do is use the "Add Path" button to set the search paths for the duplicates.  It will find ANY duplicates within those search paths, regardless of their names, creation dates, etc. => if they're duplicates, they'll be identified.

In the example above, if you clicked "Add Path" and selected C:\TestFolder as a path; then clicked "Add Path" and selected K:\DistantStuff  (or if the network location isn't mapped you can simply "point" to the network location); then click on Start Search, it would find AAAAA.111 and CCCCC.333 as duplicates, but would not find BBBBB.222 because you didn't include D:\MoreStuff in the search path.     The search would be quicker if you checked the "Fast Search" box before you clicked on Start Search.    Note you could also simply set the paths to C:, D:, and K: and it would find all 3 of these duplicates ... but this would take a LONG time as it would check EVERY file on all three drives => but it would indeed find all instances of duplicates (which is what you asked for).
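The same test can be scripted if you want to check the behaviour without the GUI. The sketch below is an illustration only, not the tool's own logic: it compares every file under the chosen search paths byte-by-byte against one reference file using Python's standard `filecmp` module. The function name `find_copies` is invented for the example.

```python
import filecmp
import os


def find_copies(reference, search_paths):
    """Return files under search_paths whose contents equal `reference`.

    Names, extensions and dates are ignored; shallow=False forces a real
    content comparison instead of trusting size and timestamp.
    """
    reference = os.path.abspath(reference)
    matches = []
    for root in search_paths:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                candidate = os.path.abspath(os.path.join(dirpath, name))
                if candidate == reference:
                    continue  # do not report the reference as its own copy
                if filecmp.cmp(reference, candidate, shallow=False):
                    matches.append(candidate)
    return sorted(matches)
```

Called with C:\TestFolder\AAAAA.111 as the reference and C:\TestFolder and K:\DistantStuff as the search paths, this would report CCCCC.333 but not BBBBB.222, for the same reason as above: D:\MoreStuff was never searched.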
 
LVL 11

Author Comment

by:CynepMeH
ID: 21816888
Gary, thanks for your suggestions. I want to see what else may surface, although it is beginning to look like we may have to go with commercial software after all, as I am told we'll need to produce reports too.

I'm currently looking at Quest and NTP Software tools - we'll see how much they cost and what they can do.

If you know of any commercial products that can provide more features and benefits, please advise.

Thanks.
 
LVL 70

Expert Comment

by:garycase
ID: 22134938
The product I suggested would work fine ==> as I noted, it finds "... duplicates even if the filenames do not match ...";  works "... on Windows XP/2003 ...";  is reasonably "... robust ..." ; is not "... java-based ...";  and is free ("... free solution would be ideal ...").   It is, as the author noted, a bit "... sluggish ..." => but any product that's not index-based and has to search the entire path will take a bit of time.

Bottom line:  Duplicate File Finder would certainly seem to have been a solution that would resolve the question.   The fact that the asker elected to go a different path doesn't negate that solution.

Question has a verified solution.