views:

1420

answers:

8

I have an application where a Hilbert R-Tree (wikipedia) (citeseer) would seem to be an appropriate data structure. Specifically, it requires reasonably fast spatial queries over a data set that will experience a lot of updates.

However, as far as I can see, none of the descriptions of the algorithms for this data structure even mention how to actually calculate the requisite Hilbert Value; which is the distance along a Hilbert Curve to the point.

So any suggestions for how to go about calculating this?

+4  A: 

Fun question!

I did a bit of googling, and the good news is, I've found an implementation of Hilbert Value.

The potentially bad news is, it's in Haskell...

http://www.serpentine.com/blog/2007/01/11/two-dimensional-spatial-hashing-with-space-filling-curves/

It also proposes a Lebesgue distance metric you might be able to compute more easily.

Mike G.
Thank you, that looks to be precisely what I was struggling to find (and Haskell is fine :)
wrt
+2  A: 

See uzaygezen.

A: 

Java implementation is http://www.lrdev.com/lr/java/hilbert.java

This implementation is a simple recursive drawing algorithm and does not calculate the value itself.
Laurynas Biveinis
A: 

Suggestion: A good simple efficient data structure for spatial queries is a multidimensional binary tree.

In a traditional binary tree, there is one "discriminant"; the value that's used to determine whether you take the left branch or the right branch. This can be considered to be the one-dimensional case.

In a multidimensional binary tree, you have multiple discriminants; consecutive levels use different discriminants. For example, for two dimensional spacial data, you could use the X and Y coordinates as discriminants. Consecutive levels would use X, Y, X, Y...

For spatial queries (for example finding all nodes within a rectangle) you do a depth-first search of the tree starting at the root, and you use the discriminant at each level to avoid searching down branches that contain no nodes in the given rectangle.

This allows you to potentially cut the search space in half at each level, making it very efficient for finding small regions in a massive data set. (BTW, this data structure is also useful for partial-match queries, i.e. queries that omit one or more discriminants. You just search down both branches at levels with an omitted discriminant.)

A good paper on this data structure: http://portal.acm.org/citation.cfm?id=361007 This article has good diagrams and algorithm descriptions: http://en.wikipedia.org/wiki/Kd-tree

+2  A: 

Below is my java code adapted from C code in the paper "Encoding and decoding the Hilbert order" by Xian Lu and Gunther Schrack, published in Software: Practice and Experience Vol. 26 pp 1335-46 (1996).

Hope this helps. Improvements welcome !

Michael

/**
 * Find the Hilbert order (=vertex index) for the given grid cell 
 * coordinates.
 * @param x cell column (from 0)
 * @param y cell row (from 0)
 * @param r resolution of Hilbert curve (grid will have Math.pow(2,r) 
 * rows and cols)
 * @return Hilbert order 
 */
public static int encode(int x, int y, int r) {

    int mask = (1 << r) - 1;
    int hodd = 0;
    int heven = x ^ y;
    int notx = ~x & mask;
    int noty = ~y & mask;
    int temp = notx ^ y;

    int v0 = 0, v1 = 0;
    for (int k = 1; k < r; k++) {
        v1 = ((v1 & heven) | ((v0 ^ noty) & temp)) >> 1;
        v0 = ((v0 & (v1 ^ notx)) | (~v0 & (v1 ^ noty))) >> 1;
    }
    hodd = (~v0 & (v1 ^ x)) | (v0 & (v1 ^ noty));

    return interleaveBits(hodd, heven);
}

/**
 * Interleave the bits from two input integer values
 * @param odd integer holding bit values for odd bit positions
 * @param even integer holding bit values for even bit positions
 * @return the integer that results from interleaving the input bits
 *
 * @todo: I'm sure there's a more elegant way of doing this !
 */
private static int interleaveBits(int odd, int even) {
    int val = 0;
    int n = Math.max(Integer.highestOneBit(odd), Integer.highestOneBit(even));

    for (int i = 0; i < n; i++) {
        int bitMask = 1 << i;
        int a = (even & bitMask) > 0 ? (1 << (2*i)) : 0;
        int b = (odd & bitMask) > 0 ? (1 << (2*i+1)) : 0;
        val += a + b;
    }

    return val;
}
michael
A: 

Michael,

thanks for your Java code! I tested it and it seems to work fine, but I noticed that the bit-interleaving function overflows at recursion level 7 (at least in my tests, but I used long values), because the "n"-value is calculated using highestOneBit()-function, which returns the value and not the position of the highest one bit; so the loop does unnecessarily many interleavings.

I just changed it to the following snippet, and after that it worked fine.

  int max = Math.max(odd, even);
  int n = 0;
  while (max > 0) {
    n++;
    max >>= 1;
  }
+2  A: 

Hey Michael,

I figured out a slightly more efficient way to interleave bits. It can be found at the Stanford Graphics Website. I included a version that I created that can interleave two 32 bit integers into one 64 bit long.

public static long spreadBits32(int y) {
    long[] B = new long[] {
        0x5555555555555555L, 
        0x3333333333333333L,
        0x0f0f0f0f0f0f0f0fL,
        0x00ff00ff00ff00ffL,
        0x0000ffff0000ffffL,
        0x00000000ffffffffL
    };

    int[] S = new int[] { 1, 2, 4, 8, 16, 32 };
    long x = y;

    x = (x | (x << S[5])) & B[5];
    x = (x | (x << S[4])) & B[4];
    x = (x | (x << S[3])) & B[3];
    x = (x | (x << S[2])) & B[2];
    x = (x | (x << S[1])) & B[1];
    x = (x | (x << S[0])) & B[0];
    return x;
}

public static long interleave64(int x, int y) {
    return spreadBits32(x) | (spreadBits32(y) << 1);
}

Obviously, the B and S local variables should be class constants but it was left this way for simplicity.

Ryan Deak

In the original they don't do a 16-bit shift for 16-bit inputs, so you don't need to do a 32-bit shift for 32-bit inputs in the extended version. If you shift left by 32 bits then mask off the high 32-bits you'll always be left with zero, so this first shift-and-mask operation is effectively `x=x|0`, a no-op.
Tim Sylvester
A: 

PHP-Implementation is here: http://hilbert-curve.sourceforge.net

epitaph