views:

37

answers:

2

Hello.

I am looking to take a hash range (md5 or sha1) and split it into n equal ranges.

For example, if m (num nodes) = 5, the entire hash range would be split by 5 so that there would be a uniform distribution of key ranges. I would like n=1 (node 1) to be from the beginning of the hash range to 1/5, 2 from 1/5 to 2/5, etc all the way to the end.

Basically, I need to have key ranges mapped to each n such that when I hash a value, it knows which n is going to take care of that range.

I am new to hashing and a little bit unsure of where I could start on solving this for a project. Any help you could give would be great.

A: 

If you can stand a little very hard to remove bias (any power of two is impossible to divide evenly in 5, so there has to be some bias), then modulo (% in C and many other languages with C-like syntax) is the way to divide the full range into 5 almost identically-sized partitions.

Any message m with md5(m)%5==0 is in the first partition, etc.

Pascal Cuoq
A: 

If you are looking to place a hash value into a number of "buckets" evenly, then some simple math will do the trick. Watch out for rounding edge cases... You would be better to use a power of 2 for the BUCKETS value.

This is python code, by the way, which supports large integers...

BUCKETS    = 5
BITS       = 160

BUCKETSIZE = 2**BITS / BUCKETS

int('ad01c5b3de58a02a42367e33f5bdb182d5e7e164', 16) / BUCKETSIZE == 3
int('553ae7da92f5505a92bbb8c9d47be76ab9f65bc2', 16) / BUCKETSIZE == 1
int('001c7c8c5ff152f1cc8ed30421e02a898cfcfb23', 16) / BUCKETSIZE == 0
gahooa