• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1529
  • Last Modified:

Hash Collision (easy)


I have seen MD5 hash collisions and I already created 2 exe files with same MD5, that's all I know.

I want to learn it in deep, for startup, I want to start with COMP128 v1 which is 10 years old and seems to have really easy collision. I see/read papers about it, but I can't really understand so I can't launch attack.

I'm writing my own code for this attack, but as I don't understand paper, I can't do it properly. I have everything needed, actual SIM cards, known KIs, COMP128 algorithm everything.

For example see this paper:



It says:

The attack exploits a lack of diffusion: there's a narrow ``pipe'' inside COMP128. In particular, bytes i,i+8,i+16,i+24 at the output of the second round depend only on bytes i,i+8,i+16,i+24 of the input to COMP128. (By ``round'', I refer to one layer of ``butterflies'' and S-boxes; there are a total of 5*8 rounds in COMP128.) Bytes i,i+8 of the COMP128 input are bytes i,i+8 of the key, and bytes i+16,i+24 of the COMP128 input are bytes i,i+8 of the challenge input.

I can't understand:

Could someone give me some easy documents/references for learning this type of knowledge?

Thanks from now
  • 6
  • 5
  • 2
2 Solutions
nociSoftware EngineerCommented:
I am not sure what you are exactly looking for:
(general overview)
How about: (illustrated)

Cryptography handbook:

(A lot of references)
Or Bruce Schneiers site (a lot of valuable information):

CSecurityAuthor Commented:
I want to learn Hash collisions, none of the links are related. Specially I want to learn COMP128 collision in depth.

nociSoftware EngineerCommented:
You asked about:


Those are cryptography concepts, if you don't know what those concepts are how do you expect to get "cracking"
The excerpt you have show explains the crux of the problem with COMP128, in terms of cryptography "macro's"
like rounds, s-boxes etc.so if you understand the concepts behind them you can expect to interpret that short piece.
rounds, butterflies are not "cracking" of "collision" concepts. Hence the references to fairly basic cryptography stuff.
Explaining each concept here is more or less copy-typing what is stated in the references.

Collisions are nothing else then multiple inputs generating the same hash outcome.
They best work with concepts like passwords and might help when hashes are used as signatures. (datacom).
Worried about phishing attacks?

90% of attacks start with a phish. It’s critical that IT admins and MSSPs have the right security in place to protect their end users from these phishing attacks. Check out our latest feature brief for tips and tricks to keep your employees off a hackers line!

btanExec ConsultantCommented:
For a start if you are interested in MD5 hash which I thought we should look at it and spend the effort since it is widely used especially in certificate generated using it instead of SHA1. The couple of link below may be of interest

MD5 Collision Demo @ http://www.mscs.dal.ca/~selinger/md5collision/
MD5 collision finder @ http://www2.mat.dtu.dk/people/S.Thomsen/wangmd5/

Apparently, you may have interest in it since COMP128 is widely used in h/w implementation like the GSM A3/A8 schemes. Strictly speaking, hash is used for integrity checks unlike COMP128 which is implemented for the A3 (authentication) and A8 (key derivation) algorithms using a central core algorithm doing hash function and based on a butterfly structure. The output of the algorithm contains the response used for authentication and session key for the A5 stream cipher used for encryption. COMP128 is considered unsafe because of the small changes in the hash input is not sufficiently dispersed resulting in higher chances of collision at the output.

I suggest looking at the slides, the diagram may help visualisation
@ http://www.tcs.hut.fi/Studies/T-79.514/slides/S5.Brumley-comp128.pdf

I will attempt to explain (pardon me as I am no cryptographer).

For 5*8 rounds, it meant there are total of 8 cycles (or rounds) where the use of 128 bits (16 bytes) Ki is placed as the first 0-15th bytes appended by 128 bits RAND forming the 16-31th bytes. This total of 32 bytes is then run through a set of operation which comprises of 5 round of compression in each of the cycle. The compression is using sort of "butterfly structure".  It is graphically like butterfly. Maybe it is best to check out the slides, I will not say I understand all of it (best effort)

For S-box, it is referencing to substitution box or basically a look up table based on your inputs
@ http://en.wikipedia.org/wiki/S-box

For Butterfly, it is a fundamental mathematical operation in many fast transforms, and shows the processing of two elements in one stage of the transform. The transform is referring to mathematics such as Fourier transform. The name comes from a common graphic depiction of FFT operation which has the shape of a standing "hourglass," or butterfly wings on edge.

For crypto terms, can do a quick search for it and go into that link, good for quick explanation and each statement has links to assist further @ http://www.ciphersbyritter.com/GLOSSARY.HTM

Having said all these, as for the collision for COMP128, if the idea is to know in depth how it works then to appreciate how it can collide. Agree the need to understand the crypto aspects but do note it take another whole lots of fundamental (esp math working) again to overcome another "hill" - not straightforward. But if looking at application, maybe the simple pseudo codes may help
@ http://de.wikipedia.org/wiki/COMP128 (can use google to translate)

Hope it helps
CSecurityAuthor Commented:
I know MD5 collision, I don't want to learn about it.

My question is in COMP128 where is says, Round 2 bits has collision, I don't know where is Round 2 and in Output of COMP128 (which is available only) where is Round 2? I don't see any collision in output with some changes in COMP128

I know a little about AES S-Box etc. and I know about COMP128 S-Box, but I want to know more about butterflies, I know that because of butterfly style substition they call it butterflies, but I'm sure after that butterflies major algorithm applies so in output you can't see result of butterflies, so how we can analysis and check collision?
CSecurityAuthor Commented:
Let me tell 1 question specifically instead of generic question:

"By ``round'', I refer to one layer of ``butterflies'' and S-boxes; there are a total of 5*8 rounds in COMP128."

When we give A to COMP128 and receive B output, we can't know result of Round 1 or round 2 or round 3 in hashing of COMP128, we see only LAST result (output) of COMP128, how you want to know result of Round 1 output or Round 2 output?

CSecurityAuthor Commented:
nociSoftware EngineerCommented:
now you assume correct use of the hash function. What if random isn't exactly random,
it changes all the BYTES, but are some bits unaffected? Can you infer more bits from u changed ones? if some bits are not carried over, then you might have an angle to create collisions.

And creating collisions is never supposed to be easy..., it's exactly why why hash functions exist, to validate that a change did or didn't occur in some data. The earliest form of that is checksumming, but that is easily subverted so more sophisticated function got built...

CSecurityAuthor Commented:
I want to understand sentence in COMP128 attack and texts related to ROUNDS, how to apply it? What should I do to see a simple collision in Round X? How to know result of Round X?
nociSoftware EngineerCommented:
Well peek in the data while rounds are calculated for a simple sample.
So dump the data at the end of the rounds 1-7 loop and compare it.
Then you can learn each step and see how bits get moved around...

      /* Round 1-7 */
      for (i=0; i<7; i++) {
            /* x[0-15] = Ki */
            memcpy(x, ki, 16);

            /* Compression */

            /* FormBitFromBytes */
            _comp128_bitsfrombytes(x, bits);

            /* Permutation */
            _comp128_permutation(x, bits);
-- Print here for end of round result --


And maybe you need to print intermediates. from in between function calls.
This is fairly basic computer program debugging , may be you need some more basic educaton w.r.t. programming and mathematics if you want to pursue this task.
CSecurityAuthor Commented:
I have programming skills, no worries.

Ok, let's work practical, not in technical or imaginations or assumptions.

for example I want to use COMP128 algorithm to find Ki of an unknown card.

I send a RAND to real card and receive X output.

I run same RAND with a Ki which I create, assume:
Ki: 1D000000000000009900000000000000

After 7 round or even after round 2 to run 2-R attack, what should I understand from it? How I should compare something/some result with real card's ALL ROUNDS output?

This is my problem. I know hash algorithm, I know debugging, I know programming
nociSoftware EngineerCommented:
to find a weakness you probably start with an All zero RAND number and see if you can predict a bit set in the Ki/Data part after the hash.
if you can find a pattern to  influence the end result then you have a faster way than brute forcing a whole list of values against the keyspace and.
btanExec ConsultantCommented:
understand that the attack is due to narrow pipe, extracted some para for consideration though.


Specifically, the output bytes of the second round of compression i, i+8, i+16, i+24 are dependent ONLY upon their corresponding input bytes. Thus, the ’pipe’ has a width of 4 bytes.

Involves sending lots of challenges to the card, and collecting the responses. Only bytes i and i+8 are varied; the rest are held constant.

How can we detect a collision in the second round? Since each byte depends only upon 2 bytes of the previous rounds output, a collision in the second round will propagate throughout the rest of the algorithm, causing a collision in the response as well.

Since the pipe is 4 bytes wide (7-bit values), the birthday paradox says we can expect a collision to occur after 2^(4*7/2) = 214 challenges.
One challenge equals one query to the card.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Microsoft Exchange Server

The MCTS: Microsoft Exchange Server 2010 certification validates your skills in supporting the maintenance and administration of the Exchange servers in an enterprise environment. Learn everything you need to know with this course.

  • 6
  • 5
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now