Suppose I have one URL, http://google.com. I'd like to turn it into a hash such as S3jvZLSDK, then take this hash and reverse it back into http://google.com.

To you geeks out there: what is the BEST method to do this with near-ZERO collisions?

+9  A: 

If you can reverse it, then by definition it isn't a hash. It's an encoding. Any reversible encoding has zero collisions (otherwise it couldn't be decoded unambiguously).

A common encoding for this purpose is base64.
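As a rough illustration of the encoding approach, here is a minimal Python sketch using the standard library's URL-safe base64 variant. The helper names `encode_url` and `decode_url` are made up for this example. Note that the token comes out longer than the input, so this gives reversibility but not shortening.

```python
import base64

def encode_url(url: str) -> str:
    # URL-safe base64 uses '-' and '_' instead of '+' and '/'.
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")

def decode_url(token: str) -> str:
    # Exact inverse of encode_url, so two URLs can never share a token.
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")

token = encode_url("http://google.com")
print(token)              # aHR0cDovL2dvb2dsZS5jb20=
print(decode_url(token))  # http://google.com
```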

Matthew Scharley
+6  A: 

The whole point of a hash is that it isn't reversible (short of brute-force, trying every possible input until the output matches).

Is this for a URL shortening service? The usual way of doing this is to store http://google.com in a database under a unique key, and when someone queries with that key (which could be ‘S3jvZLSDK’ if you really like random strings, but could just as easily be ‘1’) you spit the value you remembered back out again.
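A minimal sketch of that store-and-look-up idea, assuming an in-memory SQLite database and an auto-incrementing integer as the key. The table name `urls` and the helpers `store_url` and `lookup_url` are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database for the sketch; a real service would use a file or server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT NOT NULL)"
)

def store_url(url: str) -> int:
    # The database hands out a unique key, so collisions cannot occur.
    cur = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    conn.commit()
    return cur.lastrowid

def lookup_url(key: int):
    # Returns the stored URL, or None if the key is unknown.
    row = conn.execute("SELECT url FROM urls WHERE id = ?", (key,)).fetchone()
    return row[0] if row else None

key = store_url("http://google.com")
print(key)              # 1
print(lookup_url(key))  # http://google.com
```

The key here is just a row number, which matches the point above that it could as easily be '1' as a random-looking string.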

bobince
Even the brute-force approach wouldn't work. You are nearly guaranteed to find a wrong answer, since there are many more possible inputs than there are hash codes.
Igor Ostrovsky
In most cases you don't need to find the original value, only a value that would generate the same hash. But yes, in this case brute forcing the hash would be useless.
Matthew Scharley
For URLs you'd have a better chance of bruting the right input than any old string... http://google.com would be relatively easy to find, but for anything with a longer path or parameters no, there'd be no chance.
bobince
A: 

There is no way to guarantee zero collisions with a hash, but you can make collisions arbitrarily unlikely if you use a cryptographic hash with a large output size. The SHA-2 family includes SHA-512, which produces a 512-bit digest; that should do you.
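For reference, a small Python sketch of hashing a URL with SHA-512 (the 512-bit member of the SHA-2 family) via the standard `hashlib` module. The helper name `url_digest` is an assumption for this example. Keep in mind that the digest cannot be turned back into the URL, so this covers only the collision side of the question.

```python
import hashlib

def url_digest(url: str) -> str:
    # SHA-512 yields 64 bytes, shown here as 128 hex characters.
    return hashlib.sha512(url.encode("utf-8")).hexdigest()

digest = url_digest("http://google.com")
print(len(digest) * 4)  # 512 (bits: each hex character carries 4 bits)
```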

Steven Sudit
+2  A: 

Are you trying to write something like a URL shortener? If so, just generate a random string, then use a big hash table, relational database (with indexes), etc. to relate keys (S3jvZLSDK) to URLs (google.com) and vice versa.

That will give you an easy solution for handling collisions (key already exists, URL already exists) and fast lookups.
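Something along these lines, as a minimal sketch: it assumes two in-memory dicts stand in for the hash table or database, and the names `random_key`, `shorten`, and `expand` are hypothetical.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
key_to_url = {}
url_to_key = {}

def random_key(length: int = 9) -> str:
    # Random key of the same length as the S3jvZLSDK example.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def shorten(url: str) -> str:
    # URL already exists: reuse its key.
    if url in url_to_key:
        return url_to_key[url]
    # Key already exists: draw again until an unused one turns up.
    key = random_key()
    while key in key_to_url:
        key = random_key()
    key_to_url[key] = url
    url_to_key[url] = key
    return key

def expand(key: str):
    # Returns None for unknown keys.
    return key_to_url.get(key)

key = shorten("http://google.com")
print(expand(key))  # http://google.com
```

Both lookups are dictionary reads, and the retry loop is the whole of the collision handling.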

lost-theory