Solved

HASH and CRC comparison and details

Posted on 2004-09-15
Last Modified: 2008-02-01
Hi, we are working on a project that requires de-duplication of files (identifying the duplicates) by comparing them. We learned that this can be done by creating a unique hexadecimal string with CRC32 or a hash function (MD5, SHA-1, etc.), so I would like to know which is better to use. How do CRC and MD5 differ in their algorithms and in the way they create the string? How fast are they relative to each other?
Question by:cmatian
1 Comment
 
LVL 3

Accepted Solution

by:
Validor earned 1000 total points
ID: 12079161
I've done this very task before (and many variations of it).  In your case, I would recommend CRC32.  

If I may make a recommendation, it is faster to check 3 things when searching for duplicate files.  The first two are very fast and may avoid a CRC check.

1) If file sizes differ, they are not identical.
2) If file timestamps differ, they may not be identical (you decide).
3) If CRC32 checksums differ, they are not identical.
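The checks above could be sketched roughly as follows in Python. This is only an illustration, not the code from my own projects: the function names `crc32_of_file` and `find_duplicates` are made up for the example, the timestamp check is left out because that one is a policy decision, and a final byte-for-byte comparison is added as an assumption on my part, since a matching CRC32 only shows that two files are probably identical.

```python
import filecmp
import os
import zlib
from collections import defaultdict


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC32 of a file, reading in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_duplicates(paths):
    """Return a list of groups; each group is a list of paths with equal content."""
    # Check 1: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, no CRC needed

        # Check 3: files with different CRC32 values cannot be identical.
        by_crc = defaultdict(list)
        for path in same_size:
            by_crc[crc32_of_file(path)].append(path)

        for candidates in by_crc.values():
            # Extra step (assumption): confirm CRC matches byte for byte.
            confirmed = []
            for path in candidates:
                for group in confirmed:
                    if filecmp.cmp(group[0], path, shallow=False):
                        group.append(path)
                        break
                else:
                    confirmed.append([path])
            groups.extend(g for g in confirmed if len(g) > 1)
    return groups
```

Calling `find_duplicates` with a list of file paths returns only the groups that contain two or more identical files. Files with a unique size never have their contents read at all, which is where most of the time is saved.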

Most of the time, a HASH is used to "represent" data for security purposes.  A CRC (checksum) represents data in the same way, but it is usually used in situations where security is not an issue.

MD5 is supposed to be difficult to reverse.  CRC is very easy to reverse using brute force.  Both can be used to see if two pieces of data are identical.  Checksums are usually faster than a good hash, though CRC32 is not the fastest: Adler32 is faster and easier to implement.  However, with a table-driven implementation, CRC32 is fast enough.  Most other checksums have a higher margin of error (a greater chance that two different inputs produce the same value).  CRC32 has a relatively small margin of error, but MD5 and most hashes have a MUCH smaller margin of error, because a 128-bit value leaves far less room for accidental matches than a 32-bit one.
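To see how the resulting strings differ, here is a minimal sketch using Python's standard `zlib` and `hashlib` modules. The helper name `fingerprints` is made up for the example, and the sample input is arbitrary.

```python
import hashlib
import zlib


def fingerprints(data):
    """Return hex strings for CRC32, Adler-32 and MD5 of the same bytes."""
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "adler32": format(zlib.adler32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in fingerprints(b"123456789").items():
        # Each hex digit carries 4 bits of the result.
        print(name, value, len(value) * 4, "bits")
```

The two checksums come out as 8 hex digits (32 bits) and MD5 as 32 hex digits (128 bits). I have not included timings, because relative speed depends heavily on the implementation and the hardware; it is better to measure on your own files.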

MD5 is good for password verification (sometimes called an MD5 shared secret password) or as a proxy for private data, and CRC32 is good for comparing files or data blocks, where speed is more important.

CRC32 is also much smaller (4 bytes, versus 16 bytes for MD5) and stores better in a database.