What is the best reversible hash algorithm for a URL? (near-Zero collision!)

tags:

hash

Suppose I have one URL: http://google.com

I'd like to turn it into a hash, such as S3jvZLSDK, and then take this hash and reverse it back into http://google.com.

To you geeks out there: what is the BEST method to do this with near-ZERO collisions?

+9 A:

If you can reverse it, then by definition it isn't a hash. It's an encoding. Any reversible encoding has zero collisions (otherwise it couldn't be reversed accurately).

A common encoding for this purpose is base64.
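As a rough illustration of the encoding approach, here is a minimal Python sketch using the standard library's URL-safe base64 variant. The function names `encode_url` and `decode_url` are made up for this example.

```python
import base64

def encode_url(url: str) -> str:
    """Encode a URL as URL-safe base64 text (hypothetical helper name)."""
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")

def decode_url(token: str) -> str:
    """Reverse encode_url and recover the original URL exactly."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")

token = encode_url("http://google.com")
print(token)              # aHR0cDovL2dvb2dsZS5jb20=
print(decode_url(token))  # http://google.com
```

Note that the encoded form is longer than the input, so this gives reversibility but not a short key.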

Matthew Scharley 2009-09-30 22:44:20

+6 A:

The whole point of a hash is that it isn't reversible (short of brute-force, trying every possible input until the output matches).

Is this for a URL shortening service? The usual way of doing this is to store http://google.com in a database under a unique key, and when someone queries with that key (which could be ‘S3jvZLSDK’ if you really like random strings, but could just as easily be ‘1’) you spit the value you remembered back out again.
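A minimal sketch of the store-and-look-up idea, assuming an in-memory SQLite database and sequential integer keys. The table name `urls` and the helpers `shorten` and `expand` are hypothetical, chosen only for this example.

```python
import sqlite3
from typing import Optional

# In-memory database for the demo; a real service would use a persistent one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT NOT NULL)"
)

def shorten(url: str) -> int:
    """Store the URL and return the unique key the database assigned to it."""
    cur = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    conn.commit()
    return cur.lastrowid

def expand(key: int) -> Optional[str]:
    """Look the key up and return the stored URL, or None if it is unknown."""
    row = conn.execute("SELECT url FROM urls WHERE id = ?", (key,)).fetchone()
    return row[0] if row else None

key = shorten("http://google.com")
print(key)          # 1
print(expand(key))  # http://google.com
```

Because the database hands out each key exactly once, collisions cannot occur, and the key can be as short as the row count allows.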

bobince 2009-09-30 22:45:54

Even the brute-force approach wouldn't work. You are nearly guaranteed to find a wrong answer, since there are many more possible inputs than there are hash codes.
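To see why brute force tends to land on a wrong input, here is a small sketch that assumes a deliberately tiny 16-bit hash (the first 4 hex digits of SHA-256). The names `tiny_hash` and `brute_force` are invented for the demo, and the candidates are simply the decimal strings "0", "1", "2", and so on.

```python
import hashlib

def tiny_hash(text: str) -> str:
    """First 4 hex digits (16 bits) of SHA-256; deliberately tiny for the demo."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:4]

def brute_force(target: str, limit: int = 10_000_000):
    """Try decimal strings in order until one hashes to the target value."""
    for n in range(limit):
        candidate = str(n)
        if tiny_hash(candidate) == target:
            return candidate
    return None  # nothing matched within the limit

original = "http://google.com"
found = brute_force(tiny_hash(original))
if found is not None:
    # The hashes match, yet the recovered input is not the original URL.
    print(found, tiny_hash(found) == tiny_hash(original), found == original)
```

With only 65,536 possible hash values, some unrelated string matches long before the search would ever reach the real URL.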

Igor Ostrovsky 2009-09-30 22:48:01

In most cases you don't need to find the original value, only a value that would generate the same hash. But yes, in this case brute forcing the hash would be useless.

Matthew Scharley 2009-09-30 22:50:15

For URLs you'd have a better chance of brute-forcing the right input than for any old string... http://google.com would be relatively easy to find, but for anything with a longer path or parameters there'd be no chance.

bobince 2009-09-30 23:41:39

There is no way to guarantee zero collisions, but you can make collisions arbitrarily unlikely if you use a cryptographic hash with a large output size. The SHA-2 family contains a version with a 512-bit output (SHA-512); that should do you.
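To put a rough number on "arbitrarily unlikely", the sketch below uses the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 · 2^b)), for n items hashed to b bits. It assumes the hash output is uniformly distributed; the function name `collision_probability` is made up for this example.

```python
import math

def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one collision among num_items hashes."""
    pairs = num_items * (num_items - 1) // 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is extremely small.
    return -math.expm1(-pairs / 2 ** hash_bits)

# 100,000 URLs in a 32-bit hash: a collision is more likely than not.
print(collision_probability(100_000, 32))

# A billion URLs in a 512-bit hash: the probability is vanishingly small.
print(collision_probability(10 ** 9, 512))
```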

Steven Sudit 2009-09-30 22:49:48

+2 A:

Are you trying to write something like a URL shortener? If so, just generate a random string, then use a big hash table, relational database (with indexes), etc. to relate keys (S3jvZLSDK) to URLs (google.com) and vice versa.

That will give you an easy solution for handling collisions (key already exists, URL already exists) and fast lookups.
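The random-key variant might look something like this sketch, which uses two plain dictionaries in place of a database. The 9-character key length, the alphanumeric alphabet, and the names `shorten` and `expand` are assumptions for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed key alphabet
KEY_LENGTH = 9                                   # assumed, matches S3jvZLSDK

key_to_url = {}  # key -> URL, for expanding
url_to_key = {}  # URL -> key, so a repeated URL reuses its key

def shorten(url: str) -> str:
    """Return the existing key for url, or assign a fresh random one."""
    if url in url_to_key:  # URL already exists
        return url_to_key[url]
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(KEY_LENGTH))
        if key not in key_to_url:  # key already exists: draw again
            break
    key_to_url[key] = url
    url_to_key[url] = key
    return key

def expand(key: str):
    """Return the URL stored under key, or None if the key is unknown."""
    return key_to_url.get(key)

k = shorten("http://google.com")
print(k, expand(k))
```

Both collision cases mentioned above are handled by a dictionary lookup: an existing URL returns its old key, and an existing key triggers a redraw.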

lost-theory 2009-09-30 22:59:42
