views:

43

answers:

2

So, I am looking at different hash functions to use for hashing a 4 tuple ip and port to identify flows.

One I came across was ((size_t)(key.src.s_addr) * 59) ^ ((size_t)(key.dst.s_addr)) ^ ((size_t)(key.sport) << 16) ^ ((size_t)(key.dport)) ^ ((size_t)(key.proto));

Now for the world of me, I cannot explain the prime used (59). why not 31, and then why go mess it up by multiplying the sport by a power of 2. Is there a better hash function to be used for ip addresses ?

A: 

Examine the output of the function for uniform distribution. If you don't like it, plug in some different primes until you get a distribution you like. Hashing can be a very dark art with no 'right' answer.

Michael Dorgan
+1  A: 

The prime number is used because when one value is multiplied by a prime number it tends to have a higher probability of remaining unique when other similar operations are accumulated on top of it. The specific value 59 may have been choosen arbitrarily or it may be intentional. It is hard to tell. It is possible that 59 tended to generate a better distribution of values based on the most likely inputs.

The shift by 16 may be because ports are limited to the range 2^16. The function appears to be moving the source port into the higher part of the bitfield while leaving the destination port in the lower part. I think this can be explained further in my next paragraph.

Another reason why the multiplication takes place (and this is true of the shift operation as well) is because it breaks down the associative nature of the hash function. Remember, XOR is associative so the IPs src=192.168.1.1 and dst=192.168.1.254 would hash to the same value as src=192.168.1.254 and dst=192.168.1.1 (swapped) if the multiplication were not there.

Brian Gideon
So, why not multiply each ip with the prime instead of just the source.
shrin
The hash function would become associative again because `(a*P)^(b*P) = (b*P)^(a*P)`. Though it is common to see two coprime numbers working in tandem to do something similar.
Brian Gideon