Solved

Hash and CRC comparison and details

Posted on 2004-09-15
Last Modified: 2008-02-01
Hi, we are working on a project that requires de-duplication of files (identifying duplicates by comparing them). We learned that this can be done by generating a hexadecimal string for each file with a checksum or hash function (e.g. CRC32, MD5, SHA-1). I would like to know which is better to use, how CRC and MD5 differ in their algorithms and in the way they create the string, and how fast they are relative to each other.
Question by:cmatian
3 Comments
 
Accepted Solution

by: Validor
I've done this very task before (and many variations of it).  In your case, I would recommend CRC32.  

If I may make a recommendation, it is faster to check three things, in this order, when searching for duplicate files. The first two are very fast and may let you skip the CRC check entirely.

1) If file sizes differ, they are not identical.
2) If file timestamps differ, they may not be identical (you decide).
3) If CRC32 checksums differ, they are not identical.
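The checks above can be sketched in Python roughly as follows. This is only an illustration, not the code I used: the function names `file_crc32` and `candidate_duplicates` are made up for the example, and the timestamp check is left out since whether to use it is your call. Note that matching size and CRC32 only makes two files candidates; a final byte-by-byte comparison (for instance `filecmp.cmp(a, b, shallow=False)`) would be needed to be certain.

```python
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=65536):
    """Return the CRC32 of a file, reading it in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def candidate_duplicates(paths):
    """Group paths whose file size and CRC32 both match (hypothetical helper)."""
    # Check 1: files with different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, so no CRC needs to be computed
        # Check 3: files with different CRC32 values cannot be identical.
        by_crc = defaultdict(list)
        for p in same_size:
            by_crc[file_crc32(p)].append(p)
        groups.extend(sorted(g) for g in by_crc.values() if len(g) > 1)
    return groups
```

Calling `candidate_duplicates(list_of_paths)` returns a list of groups, each group being paths that share both size and CRC32.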

Most of the time, a hash is used to "represent" data for security purposes. A CRC (checksum) is used in the same way, but usually in situations where security is not an issue.

MD5 is supposed to be difficult to reverse. CRC is very easy to reverse using brute force. Both can be used to see if two pieces of data are identical. Checksums are usually faster than a good hash, though CRC32 is not the fastest of them: Adler32 is faster and easier to implement. However, with a table-driven implementation, CRC32 is fast enough. Most other checksums have a higher margin of error (the chance that two different inputs produce the same value). CRC32 has a relatively small margin of error, but MD5 and most hashes have a MUCH smaller one.
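To make the size difference behind that margin of error concrete, here is a small Python sketch using the standard `zlib` and `hashlib` modules. The helper name `digests` is just something chosen for this example. CRC32 and Adler32 each produce 32 bits (8 hex digits), while MD5 produces 128 bits (32 hex digits), which is why accidental matches are far less likely with MD5.

```python
import hashlib
import zlib


def digests(data):
    """Return hex strings for CRC32, Adler32 and MD5 of a bytes object."""
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "adler32": format(zlib.adler32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in digests(b"example file contents").items():
        # Each hex digit carries 4 bits.
        print(f"{name:8s} {value}  ({len(value) * 4} bits)")
```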

MD5 is good for password verification (sometimes called an MD5 shared secret password) or as a proxy for private data, and CRC32 is good for comparing files or data blocks, where speed is more important.
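If you want to measure the speed difference on your own machine rather than take my word for it, something like the following Python sketch would do. The function name `time_checksums` and the 16 MB test buffer are arbitrary choices for the example, and the numbers will depend heavily on the library implementation and hardware, so no particular ranking is guaranteed.

```python
import hashlib
import timeit
import zlib


def time_checksums(data, repeats=5):
    """Return the best-of-N wall time in seconds for each function on data."""
    funcs = {
        "crc32": lambda: zlib.crc32(data),
        "adler32": lambda: zlib.adler32(data),
        "md5": lambda: hashlib.md5(data).digest(),
    }
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeats))
        for name, fn in funcs.items()
    }


if __name__ == "__main__":
    payload = bytes(16 * 1024 * 1024)  # 16 MB of zero bytes as a stand-in file
    for name, secs in time_checksums(payload).items():
        print(f"{name:8s} {secs * 1000:.1f} ms")
```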

A CRC32 value is also much smaller (32 bits, versus 128 bits for MD5) and stores better in a database.