ansaurus

Question

Very low cost hash function

Answer 1

A:

Rewire the bits in a fixed, randomly chosen order and take the lower log2(n) bits.

Or just take the lower log2(n) bits if your data is evenly distributed.
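
A minimal sketch of the rewiring idea, assuming 8-bit inputs and a power-of-two number of buckets. The wiring table PERM and the name rewire_hash are made up for illustration; any fixed permutation of the bit positions would do, and in hardware it costs nothing but wires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed wiring: output bit i is taken from input bit PERM[i].
   The particular order is arbitrary; it only has to be a permutation of 0..7. */
static const uint8_t PERM[8] = {5, 2, 7, 0, 3, 6, 1, 4};

/* Rewire the 8 input bits, then keep the low `bits` bits (n = 2^bits buckets). */
static uint8_t rewire_hash(uint8_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 8);
    uint8_t y = 0;
    for (unsigned i = 0; i < 8; i++) {
        y |= (uint8_t)(((x >> PERM[i]) & 1u) << i);
    }
    return (uint8_t)(y & ((1u << bits) - 1u));
}
```

With bits = 3 this selects input bits 5, 2 and 7 as the bucket index, so the choice of wiring decides which input bits matter.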

Quassnoi 2009-01-16 21:55:41

Answer 2

+2 A:

CRC?

There is already a lot of hardware support for this too.
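
As a software illustration only, here is a bit-at-a-time CRC-32 using the common reflected polynomial 0xEDB88320, reduced to a bucket index. The helper name crc_bucket and the final modulo step are assumptions, not part of the answer; in hardware the inner loop would be a shift register with XOR taps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR all ones). */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int k = 0; k < 8; k++) {
            /* If the low bit is set, shift and XOR in the polynomial; otherwise just shift. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical helper: reduce the CRC to one of n buckets. */
static uint32_t crc_bucket(const uint8_t *data, size_t len, uint32_t n)
{
    assert(n > 0);
    return crc32_bitwise(data, len) % n;
}
```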

Adam Peck 2009-01-16 22:12:36

Answer 3

+4 A:

The canonical form of that is h(x) = (a*x + b) mod n, where a and b are constants and n is the size of your hash table. You want to make n a prime number, to get optimal(ish) distribution.

Note that this is sensitive to certain kinds of distributions -- for example, just doing x mod n relies mostly on the randomness of the low-order bits; if they are not random in your set, you will get fairly significant skew.
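
A small sketch of the formula above. The function name affine_hash is made up for illustration, and the 64-bit intermediate is an implementation choice so that a*x + b cannot overflow for 32-bit inputs.

```c
#include <assert.h>
#include <stdint.h>

/* h(x) = (a*x + b) mod n, computed in 64 bits so the product cannot wrap. */
static uint32_t affine_hash(uint32_t x, uint32_t a, uint32_t b, uint32_t n)
{
    assert(n > 0);
    return (uint32_t)(((uint64_t)a * x + b) % n);
}
```

For instance, with a = 1 and b = 0 the keys 16 and 24 both land in bucket 0 when n = 8, because only their low three bits are used, but in buckets 3 and 11 when n = 13.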

Bob Jenkins has designed several very good hashing functions; here's one specifically designed to be simple to implement in hardware: http://burtleburtle.net/bob/hash/nandhash.html

For a lot of different hash functions, design discussions, etc, see the rest of the site: http://burtleburtle.net/bob/hash/

SquareCog 2009-01-16 22:20:46

Don't you mean "...just doing _x_ mod n is mostly ..." ?

David Schmitt 2009-01-16 23:35:59

yes I do, thanks

SquareCog 2009-01-16 23:42:56

The b in (a*x+b) mod n won't affect anything, in that things that collide still will, and things that don't still won't.

A. Rex 2009-01-17 01:01:35

Answer 4

+2 A:

I believe this is the best possible hash for this problem (faster than modulo, better distribution), given that all your numbers in 0..N have the same probability:

h = z * n / N;

All values are integers, so the division is an integer division. This way each of the n hash values receives the same number of inputs from 0..N-1, give or take one.

For example, when n=3 and N=7 (values 3 and 7 not included in the ranges), the hashes are:

z * n / N = hash
----------------
0 * 3 / 7 = 0
1 * 3 / 7 = 0
2 * 3 / 7 = 0
3 * 3 / 7 = 1
4 * 3 / 7 = 1
5 * 3 / 7 = 2
6 * 3 / 7 = 2

So every hash value is used equally often, to within one input. Just take care that n*(N-1) does not overflow.

If N is a power of 2, you can replace the division by shifting. e.g. if N=256:

h = (z * n) >> 8;
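
Both forms can be sketched as follows. The function names are hypothetical, and the product is widened to 64 bits here as one way to sidestep the overflow concern for 32-bit inputs.

```c
#include <assert.h>
#include <stdint.h>

/* h = z * n / N for z in [0, N); the 64-bit product avoids overflow of z * n. */
static uint32_t range_hash(uint32_t z, uint32_t n, uint32_t N)
{
    assert(N > 0 && z < N);
    return (uint32_t)(((uint64_t)z * n) / N);
}

/* Same mapping when N = 2^shift: the division becomes a right shift. */
static uint32_t range_hash_pow2(uint32_t z, uint32_t n, unsigned shift)
{
    assert(shift < 32 && z < (1u << shift));
    return (uint32_t)(((uint64_t)z * n) >> shift);
}
```

With n = 3 and N = 7 the first function reproduces the table above, and for N = 256 the two functions agree on every input.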

martinus 2009-01-16 22:22:21

Answer 5

+1 A:

If you're truly talking hardware (vs. software, or a hardware implementation of software), and your number of hash buckets n can be written as n = 2^m - 1, the easiest is probably a maximum-length linear feedback shift register (LFSR), of which CRC is an instance.

Here's one way you could use an m-bit shift register to create a hash of a data packet. Make sure all data is represented consistently as a K-bit string; if you have shorter strings, pad one end with zeros.

1. Initialize the state of the LFSR (CRC-32 uses all 1's; all zeros is probably bad).
2. Shift in the bits of your data.
3. (Optional) Shift in an additional j zeros (j between m and 2m is probably a good choice); this adds some additional hashing to reduce direct correlation between input/output bits.
4. Use the contents of the m-bit shift register as your hashed value.
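
The steps above can be sketched in software as follows, assuming m = 16 and the well-known maximum-length polynomial x^16 + x^14 + x^13 + x^11 + 1 in Galois form (feedback mask 0xB400). The function names, the all-ones initial state and the CRC-style way of XORing each input bit into the feedback are illustrative choices, not something the answer specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Feedback mask for x^16 + x^14 + x^13 + x^11 + 1 (Galois, right-shifting form). */
#define LFSR_TAPS 0xB400u

/* Clock the register once, XORing one input bit into the feedback path. */
static uint16_t lfsr_step(uint16_t state, unsigned bit)
{
    unsigned feedback = (state ^ bit) & 1u;
    state = (uint16_t)(state >> 1);
    if (feedback) {
        state = (uint16_t)(state ^ LFSR_TAPS);
    }
    return state;
}

static uint16_t lfsr_hash(const uint8_t *data, size_t len, unsigned extra_zeros)
{
    assert(data != NULL || len == 0);
    uint16_t state = 0xFFFFu;                    /* step 1: non-zero initial state */
    for (size_t i = 0; i < len; i++) {           /* step 2: shift in data bits, LSB first */
        for (int k = 0; k < 8; k++) {
            state = lfsr_step(state, (unsigned)(data[i] >> k) & 1u);
        }
    }
    for (unsigned j = 0; j < extra_zeros; j++) { /* step 3: optional extra zero bits */
        state = lfsr_step(state, 0u);
    }
    return state;                                /* step 4: register contents are the hash */
}
```

A call such as lfsr_hash(packet, packet_len, 24) would follow the suggestion of choosing j between m and 2m.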

Jason S 2009-01-16 22:56:29

Answer 6

A:

The FNV (Fowler-Noll-Vo) hash, or one of Bob Jenkins's hashes.
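
For reference, a minimal sketch of the 32-bit FNV-1a variant, which needs only one XOR and one multiply per input byte. The offset basis and prime are the standard published FNV constants; the bucket helper and its modulo reduction are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime. */
static uint32_t fnv1a_32(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                /* FNV prime; wraps modulo 2^32 */
    }
    return h;
}

/* Hypothetical helper: reduce the hash to one of n buckets. */
static uint32_t fnv_bucket(const uint8_t *data, size_t len, uint32_t n)
{
    assert(n > 0);
    return fnv1a_32(data, len) % n;
}
```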

Mitch Wheat 2009-01-17 00:41:50
