Solved

HASH and CRC comparison and details

Posted on 2004-09-15
Last Modified: 2008-02-01
Hi, we are working on a project that requires de-duplication of files (identifying the duplicates) by comparing them. We learned that this can be done by creating a unique hexadecimal string with a checksum or hash function (e.g. CRC32, MD5, SHA-1). I would like to know which is better to use, how CRC and MD5 differ in their algorithms and in the way they create the string, and how fast they are relative to each other.
Question by:cmatian

Accepted Solution

by: Validor
I've done this very task before (and many variations of it).  In your case, I would recommend CRC32.  

If I may make a recommendation, it is faster to check three things, in order, when searching for duplicate files.  The first two are very fast and may let you skip the CRC check altogether.

1) If file sizes differ, they are not identical.
2) If file timestamps differ, they may not be identical (you decide).
3) If CRC32 checksums differ, they are not identical.
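The size and CRC32 steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not the code I used: the function names `file_crc32` and `candidate_duplicates` are made up for the example, and the timestamp check is left out because whether it applies is your decision. Files that end up in the same group are only candidates; a byte-by-byte comparison would be needed to confirm they are really identical.

```python
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=65536):
    """Return the CRC32 of a file, read in chunks to keep memory use low."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def candidate_duplicates(paths):
    """Group files that share both size and CRC32 (hypothetical helper)."""
    # Step 1: files with different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, so no CRC is needed at all
        # Step 3: only now pay for reading the files and computing CRC32.
        by_crc = defaultdict(list)
        for p in same_size:
            by_crc[file_crc32(p)].append(p)
        groups.extend(g for g in by_crc.values() if len(g) > 1)
    return groups
```

Calling `candidate_duplicates` with a list of file paths returns a list of groups, each holding two or more paths with matching size and checksum.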

Most of the time, a HASH is used to "represent" data for security purposes.  A CRC (checksum) represents data in the same way, but it is used where security is not an issue, such as detecting accidental corruption.

MD5 is supposed to be difficult to reverse.  CRC is very easy to reverse using brute force.  Both can be used to check whether two pieces of data are identical: differing values prove the data differ, and matching values make it very likely they are the same.  Checksums are usually faster than a good hash, though CRC32 is not the fastest of them: Adler32 is faster and easier to implement.  However, with a table-driven implementation, CRC32 is fast enough.  Most other checksums have a higher margin of error (the chance that two different inputs produce the same value).  CRC32 has a relatively small margin of error, but MD5 and most hashes have a MUCH smaller one.
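To see the three values side by side, here is a small Python sketch using the standard `zlib` and `hashlib` modules. The helper name `fingerprints` is just for illustration. It shows the difference in output size: CRC32 and Adler32 give 8 hex digits (32 bits), while MD5 gives 32 hex digits (128 bits).

```python
import hashlib
import zlib


def fingerprints(data: bytes) -> dict:
    """Return CRC32, Adler32 and MD5 of the same data as hex strings."""
    return {
        # Both checksums are 32-bit values, so 8 hex digits each.
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "adler32": format(zlib.adler32(data) & 0xFFFFFFFF, "08x"),
        # MD5 is a 128-bit digest, so 32 hex digits.
        "md5": hashlib.md5(data).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in fingerprints(b"123456789").items():
        print(name, value)
```

If you want speed numbers, time each call on your own data with the `timeit` module; the results depend heavily on the implementation and the machine.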

MD5 is good for password verification (sometimes called an MD5 shared secret password) or as a proxy for private data, and CRC32 is good for comparing files or data blocks, where speed is more important.

CRC32 is also much smaller (32 bits versus 128 bits for MD5) and stores better in a database.
