tags:

views:

848

answers:

2

Can anybody explain the logic behind the use of djb2 hash function as a good option for strings. The algorithm can be found at http://www.cse.yorku.ca/~oz/hash.html

Why is it that 5381 and 33 hold such an importance in djb2 algorithm ???

Thanks, De Costo

+2  A: 

Maybe because 33 == 2^5 + 1 and many hashing algorithms use 2^n + 1 as their multiplier?

Credit to Jerome Berger

Update:

This seems to be borne out by the current version of the software package djb2 originally came from: cdb

The notes I linked to describe the heart of the hashing algorithm as using h = ((h << 5) + h) ^ c to do the hashing... x << 5 is a fast hardware way to use 2^5 as the multiplier.

John Weldon
+1  A: 

On 5381, Dan Bernstein (djb2) says in this article:

[...] practically any good multiplier works. I think you're worrying about the fact that 31c + d doesn't cover any reasonable range of hash values if c and d are between 0 and 255. That's why, when I discovered the 33 hash function and started using it in my compressors, I started with a hash value of 5381. I think you'll find that this does just as well as a 261 multiplier.

The whole thread is here if you're interested.

Ozan Yigit has a page on hash functions which says:

[...] the magic of number 33 (why it works better than many other constants, prime or not) has never been adequately explained.
Matt Curtis