views:

138

answers:

2

I have keys that can vary in length between 1 and 256 characters*; how can I calculate the probability that any two keys will collide when using md5 (baring a brute force solution of trying each key)?

* the character set is restricted to [a-z.-]

+2  A: 

Check out the birthday problem. It is exactly what you are looking for.

Ayman Hourieh
+4  A: 

Take a look at the birthday paradox, which will help you analyse this. In short, since MD5 is a 128bit hash, you need 264 items before the probably of a collision rises to 50%. There's an assumption there that MD5 is distributed evenly over that 128bit space, which I would believe it doesn't do, but gets close.

If you want to get an idea of how those numbers rack up against your key space, let's assume all your keys are 256 chars, you have 26256 possible keys, or 21023, and of course you have a 100% chance of collision after 2128 keys :)

Paul Dixon