views:

256

answers:

3

Given a set of 100 different strings of equal length, how can you quantify the probability that a SHA1 digest collision for the strings is unlikely... ?

A: 

That's Birthday Problem - the article provides nice approximations that make it quite easy to estimate the probability. Actual probability will be very very very low - see this question for an example.

sharptooth
A: 

Well, it would be 1 * ((2^160 - 1) / 2^160) * ((2^160 - 2) / 2^160) * ... * ((2^160 - 99) / 2^160).

It's pretty unlikely. You'd have to have many more strings for it to be a remote possibility.

Take a look at the table on this page on Wikipedia; just interpolate between the rows for 128 bits and 256 bits.

Anthony Mills
+9  A: 

alt text

Are the 160 bit hash values generated by SHA-1 large enough to ensure the fingerprint of every block is unique? Assuming random hash values with a uniform distribution, a collection of n different data blocks and a hash function that generates b bits, the probability p that there will be one or more collisions is bounded by the number of pairs of blocks multiplied by the probability that a given pair will collide.

(source : http://bitcache.org/faq/hash-collision-probabilities)

Peter
what a beautiful answer, an image, a quote and a reference: nice!
Sander Versluys
hey man, thanks, and thank you google :-)
Peter
In conclusion, the likelihood of a collision is in the order of 10^-45. That is very, *very* unlikely.
Paul Lammertsma