avoiding collisions with SHA1

I wasn't sure which section I should ask this question in, because it isn't a true security question, but it is a question about using the SHA1 algorithm, so I figured that I'd have the best luck in the security section.

I'm using SHA1 as a hash function in my application, and I was wondering whether anybody knows which factors matter (or should be avoided) when feeding data to the function, in order to avoid collisions. For example, is there a range of optimal input lengths that reduces collisions? Is it bad to use a limited range of input values (i.e. only ASCII chars)? If anybody has any insight on this, it would be very useful.

Thanks,
Nick

ASKER CERTIFIED SOLUTION

chandrasuresh

jhance

It _is_ possible to have two different plaintexts produce the same hash with SHA or with any other hash, but the probability is extremely low. The best ways to ensure that you don't get a collision are to:

1) Use a strong key. SHA1 has some weak keys for seeding the hash. Avoid any of these.
2) Use a long enough hash size. Obviously, shorter hashes, while faster to generate, are more likely to produce a collision.

chandrasuresh

SHA1 does not use any key. It just produces a message digest of the input data. Also, SHA1 produces a fixed-length hash of 20 bytes, whatever the input length.

jhance, can you please elaborate on the key and the hash length you mentioned?
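
To illustrate the fixed-length point, here is a minimal sketch using Python's standard `hashlib` module. The helper name `sha1_digest` and the sample inputs are made up for the example; the only thing it is meant to show is that the digest is 20 bytes for every input and that no key is involved.

```python
import hashlib

def sha1_digest(data: bytes) -> bytes:
    """Return the raw SHA-1 digest of data (hypothetical helper name)."""
    return hashlib.sha1(data).digest()

# Sample inputs of very different lengths and byte ranges (illustrative only).
inputs = [
    b"",                        # empty input
    b"a",                       # one ASCII byte
    b"only ascii chars",        # short ASCII string
    bytes(range(256)) * 1000,   # 256,000 bytes covering every byte value
]

for data in inputs:
    digest = sha1_digest(data)
    # The digest length is 20 bytes no matter how long the input is.
    print(len(data), len(digest), digest.hex())
```

Since the output size never changes, there is no hash-length knob to turn with plain SHA1; a different digest size means a different hash function.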

jhance

chandra,

Yes, you are correct. I was thinking of HMAC-SHA1, but this question is about SHA-1.
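
For anyone who mixes up the two the same way, a small sketch of the difference using Python's standard `hashlib` and `hmac` modules. The message and the two keys are placeholder values invented for the example.

```python
import hashlib
import hmac

message = b"same message"  # placeholder input

# Plain SHA-1: no key, the digest depends only on the message.
plain = hashlib.sha1(message).hexdigest()

# HMAC-SHA1: keyed, so the same message gives a different tag per key.
tag_a = hmac.new(b"key-one", message, hashlib.sha1).hexdigest()
tag_b = hmac.new(b"key-two", message, hashlib.sha1).hexdigest()

print("SHA-1      :", plain)
print("HMAC key 1 :", tag_a)
print("HMAC key 2 :", tag_b)
```

Both produce 20-byte outputs, but only HMAC takes a key, which is where the "key" advice would apply.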

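Regardless, the probability of an accidental collision with 20 bytes (160 bits) of hash is quite low: by the birthday bound, you would need on the order of 2^80 distinct inputs before a collision becomes likely.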
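
To put a rough number on "low", here is a sketch of the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)) with N = 2^160 possible digests. It assumes SHA-1 behaves like an ideal random function and only covers accidental collisions, not deliberately constructed ones. The function name `collision_probability` is made up for the example.

```python
import math

DIGEST_BITS = 160  # SHA-1 output size: 20 bytes

def collision_probability(n: int, bits: int = DIGEST_BITS) -> float:
    """Approximate chance that n random inputs contain at least one collision.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    expm1 keeps precision when the probability is extremely small.
    """
    space = 2 ** bits
    return -math.expm1(-(n * (n - 1)) / (2 * space))

# One billion hashed items: the probability is on the order of 1e-31.
print(collision_probability(10**9))

# Roughly 1.18 * 2**80 items are needed before the chance reaches about 50%.
print(collision_probability(int(1.1774 * 2**80)))
```

Under this model, input length and character range (such as ASCII only) do not enter the estimate; only the number of distinct inputs does.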