tags:

views:

63

answers:

2

I want to devise an algorithm which takes a set of values and distributes it uniformly over a much larger range. eg. i have 1000 values and want to distribute them over a range of value 2^16. Also, the input values can change continuously and i need to keep parsing each input value through the hash function so that it gets distributed uniformly over my output range.

What hashing algorithm should i use for this? I am writing the code in Java.

A: 

If you're just hashing integers, here's one way.

public class Hasho {

    private static final Long LARGE_PRIME =  948701839L;
    private static final Long LARGE_PRIME2 = 6920451961L;

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            System.out.println(i + " -> " + hash(i));
        }
    }

public static int hash(int i) {
    // Spread out values
    long scaled = (long) i * LARGE_PRIME;

    // Fill in the lower bits
    long shifted = scaled + LARGE_PRIME2;

    // Add to the lower 32 bits the upper bits which would be lost in
    // the conversion to an int.
    long filled = shifted + ((shifted & 0xFFFFFFFF00000000L) >> 32);

    // Pare it down to 31 bits in this case.  Replace 7 with F if you
    // want negative numbers or leave off the `& mask` part entirely.
    int masked = (int) (filled & 0x7FFFFFFF);
    return masked;
    }
}

This is merely an example to show how it can be done. There is some serious math in a professional quality hash function.

Tony Ennis
A: 

I'm sure this has a name, but this is what we used to do with ISAM files back in the dark ages

  1. Increment a number eg 16001
  2. Reverse the String ie. 10061 and you have your hash
  3. You might want to reverse the string bitwise

This produces a nice even spread. we used to use it with job numbers so that you could retrieve the job fairly easily, so if you have a 'magic number' candidate this can be useful.

MikeAinOz
This won't spread a small range of values over a large range, will it?
Tony Ennis
It will if the magic number has has the right number of digits. I was surprised when I first saw it running, on a six digit job number it spread them out nicely on the disk.
MikeAinOz