I'm doing a presentation on MD5 collisions and I'd like to give people any idea how likely a collision is.
It would be good to have two blocks of text which hash to the same thing, and explain how many combinations of [a-zA-Z ] were needed before I hit a collision.
The obvious answer is hash every possible combination until hit two hashes the same. So how would you go about coding this. As a quick experiment I tried hashing every combination of 5 columns of [A-Z], storing this in a .net hashtable and catching the collision exception. Two problems with this - the hashtable eventually times out, and I'm pretty sure I'm going to need A LOT more characters.
Obviously this data structure is too big to handle in memory, so now I'll have to get a database involved. Also sounds like a good project to test out azure - a bit like these guys.
Can anyone point me in the direction of an efficient way of doing this?