ansaurus

Question

Determining Perfect Hash Lookup Table for Pearson Hash

Answer 1

A:

If I understand you correctly, what you need is a sorted array with no duplicate elements that you can binary search. If the key is in the array, its index is the "hash"; otherwise, you get the size of the array. A lookup is O(log n) compared to O(1) for a lookup table, but that is good enough for a small number of elements (256 in your case).
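
A minimal sketch of that idea in Python, assuming the keys are plain strings. The helper names `build_table` and `lookup` are made up for illustration and are not from the question:

```python
from bisect import bisect_left

def build_table(keys):
    """Return the keys sorted, with duplicates removed."""
    return sorted(set(keys))

def lookup(table, key):
    """Return the index of key in table, or len(table) if it is absent."""
    i = bisect_left(table, key)
    if i < len(table) and table[i] == key:
        return i
    return len(table)

# Example keys only; "foo" appears twice to show de-duplication.
table = build_table(["foo", "bar", "baz", "foo"])
```

With this table, `lookup(table, "bar")` gives the position of "bar" in the sorted array, and any unknown key gives `len(table)`.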

leiz 2009-09-09 09:26:56

The array doesn't need to be sorted; it's a hashmap using a perfect hash. The idea is to have constant lookup time (insertion and removal don't occur).

Imagist 2009-09-10 05:22:57

Answer 2

A:

I strongly doubt that you will be able to find a solution by brute force if the number of member names is too high. Thanks to the birthday paradox, the probability that no collision occurs (a collision being two names with the same hash) is approximately 1:5000 for 64 and 1:850,000,000 for 96 member names. From the structure of your hash function (it's derived from a cryptographic construction that is designed to "mix" things well), I don't expect that an algorithm exists that solves your problem (but I would definitely be interested in such a beast).
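
These odds can be checked with a few lines of Python. This is only a sketch of the birthday-paradox estimate: it assumes each name's hash is an independent, uniformly random value out of 256, which a real lookup table only approximates. The function name `prob_no_collision` is made up for illustration:

```python
from fractions import Fraction

def prob_no_collision(n_keys, n_slots=256):
    """Probability that n_keys independent, uniformly random hash
    values drawn from n_slots possibilities are all distinct."""
    p = Fraction(1)
    for i in range(n_keys):
        # The (i+1)-th key must avoid the i values already taken.
        p *= Fraction(n_slots - i, n_slots)
    return p

for n in (64, 96):
    p = prob_no_collision(n)
    print(f"{n} names: about 1 in {float(1 / p):,.0f}")
```

Under that assumption the results are of the same order as the figures quoted above, so a random table would need to be retried that many times on average.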

Your ideal world is an illusion (as you expected): there are 256 characters you can append to 'foo', and no two of them give a new word with the same hash. As there are only 256 possible hash values, you can therefore append a character to 'foo' so that its hash equals any of the hashes of 'foo', 'bar' or 'baz'.
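
The argument can be tried out with a small Python sketch. It assumes the textbook Pearson update `h = T[h ^ byte]` starting from 0 and an arbitrary shuffled table; the hash variant and table in the question may differ, and `colliding_suffix` is a made-up helper name:

```python
import random

def pearson_hash(data: bytes, table) -> int:
    """Textbook 8-bit Pearson hash: h = T[h XOR byte], starting at 0."""
    h = 0
    for b in data:
        h = table[h ^ b]
    return h

def colliding_suffix(prefix: bytes, target: int, table) -> int:
    """Return the single byte c such that hash(prefix + c) == target."""
    for c in range(256):
        if pearson_hash(prefix + bytes([c]), table) == target:
            return c
    raise ValueError("table is not a permutation of 0..255")

# Arbitrary permutation of 0..255, seeded so the run is repeatable.
rng = random.Random(0)
table = list(range(256))
rng.shuffle(table)

# The 256 one-byte extensions of b"foo" hit every hash value exactly once.
extended = {pearson_hash(b"foo" + bytes([c]), table) for c in range(256)}

# So each existing word's hash can be reproduced by some extension of b"foo".
suffixes = {w: colliding_suffix(b"foo", pearson_hash(w, table), table)
            for w in (b"foo", b"bar", b"baz")}
```

Because the table is a permutation and XOR with a fixed value is a bijection, `extended` always contains all 256 values, whichever permutation is chosen.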

Why don't you use an existing library like CMPH?

Whoever 2009-09-09 09:51:28

Answer 3

+1 A:

Have a look at this page about minimal perfect hashes - it references a few implementations and has a short section with some thoughts about minimal perfect Pearson hashes.

Daniel Brückner 2009-09-09 10:41:27
