To find the nearest locations among points given as latitude/longitude, I was considering dividing the map into a grid of small cells, roughly 100 x 100 meters each. Each point would then be assigned to one cell.

I understand that I could instead use spatial indexes in MySQL and similar databases, but I am planning to use a non-relational database like Cassandra, where indexing spatial objects would be difficult, so some kind of grid approximation technique could be neat.

What would be the best way of creating such a grid system and mapping the 2-D spatial locations to it?

Edit1: It might be alright if the grid cells are not perfectly uniform, especially around the poles.

A: 

You can't create a rectangular grid that maps a globe uniformly. If the grid must be uniform, you must use triangles instead. But in general, I doubt that this will solve your issue. What you need is some kind of quadtree, the 2D analogue of an octree (an image search for either term gives an easy clue how this works): you divide your coordinates into a hierarchy (for example north/south/east/west of the origin for the first level, then halve each of those regions again, and so on).

Then you can do a couple of selects which will quickly yield the smallest rectangle that contains existing coordinates. Now you can check the size of the rectangle. If it's smaller than 100 m, you've found a solution. Otherwise, you'll have just a few positions to check against (usually one).
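As a rough sketch of that hierarchy (the function name `quadtree_key` and the default depth are my own assumptions, not from any particular library), each level splits the current rectangle into four quadrants and appends one digit to a key. Points in the same rectangle share a key prefix, so a prefix lookup in the database plays the role of "select everything in this rectangle":

```python
def quadtree_key(lat, lon, depth=18):
    """Return a string of `depth` digits (0-3), one quadrant choice per level.

    Hypothetical helper: at each level the current lat/lon rectangle is
    halved in both directions and the digit records which quarter holds
    the point.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    digits = []
    for _ in range(depth):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        quadrant = 0
        if lat >= lat_mid:      # northern half sets bit 1
            quadrant |= 2
            lat_lo = lat_mid
        else:
            lat_hi = lat_mid
        if lon >= lon_mid:      # eastern half sets bit 0
            quadrant |= 1
            lon_lo = lon_mid
        else:
            lon_hi = lon_mid
        digits.append(str(quadrant))
    return "".join(digits)


# Truncating a key gives the key of the enclosing (larger) rectangle.
key = quadtree_key(52.5200, 13.4050)
parent = key[:10]
```

With the default depth of 18 the cells are on the order of 100 m near the equator and get narrower east-west toward the poles, so treat the depth as something to tune.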

Google for "octree sql database" for implementations.

Aaron Digulla
A: 

Rectangular grids can be a reasonable estimation, but only over a relatively small area that isn't too close to the poles. A full-globe solution requires a different approach.
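A minimal sketch of such a rectangular grid, assuming a fixed step in degrees and roughly 111,320 m per degree of latitude (both `grid_cell` and `neighbor_cells` are made-up names for illustration):

```python
import math

# Approximate length of one degree of latitude in meters (an assumption;
# the true value varies slightly with latitude).
METERS_PER_DEGREE = 111_320.0


def grid_cell(lat, lon, cell_m=100.0):
    """Map a lat/lon pair to an integer (row, col) grid cell.

    The step is fixed in degrees, so cells are about cell_m tall everywhere
    but become narrower east-west as latitude increases.
    """
    step = cell_m / METERS_PER_DEGREE
    return (math.floor(lat / step), math.floor(lon / step))


def neighbor_cells(cell):
    """Return the 3x3 block of cells centered on `cell` (including itself)."""
    row, col = cell
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


# Candidate cells to query when looking for points near a location.
candidates = neighbor_cells(grid_cell(40.7128, -74.0060))
```

The `(row, col)` pair could serve as a partition or row key. Because the cells shrink east-west away from the equator, a 3x3 block may not cover a full 100 m in that direction at high latitudes, so more columns would have to be scanned there.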

RickNZ
Thanks Rick. I think for my use case it would be acceptable if the grid cells are not uniform, particularly in the area around the poles.
Nishith
Non-uniform grids will happen naturally if you use spherical coordinates (lat, long, radius) with the same angular distance on each edge. You might find indexing, etc., a little easier if you convert to Cartesian coordinates (x, y, z) instead. The best approach depends heavily on your specific requirements, including things like accuracy and speed.
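For illustration, the conversion could look something like this, assuming a perfectly spherical Earth with a mean radius of 6,371 km (the function names are hypothetical):

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean radius; spherical-Earth assumption


def to_cartesian(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Convert latitude/longitude in degrees to (x, y, z) in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return (x, y, z)


def chord_distance(p, q):
    """Straight-line distance through the sphere between two (x, y, z) points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


a = to_cartesian(48.8566, 2.3522)
b = to_cartesian(48.8570, 2.3530)
d = chord_distance(a, b)
```

For points a few hundred meters apart, the chord length is practically the same as the distance along the surface, and a uniform grid over (x, y, z) behaves the same at the poles as anywhere else.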
RickNZ
A: 

Without knowing your exact application requirements, it's hard to say for sure, but geohashing might be an appropriate technique: http://en.wikipedia.org/wiki/Geohash

"It is a hierarchical spatial data structure which subdivides space into buckets of grid shape. Geohashes offer properties like arbitrary precision and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision)."
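To make that concrete, here is a small sketch of the usual geohash encoding as I understand it from the Wikipedia description (bits alternate between longitude and latitude, five bits per base-32 character); the function name and default precision are my own choices:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=7):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # the first bit refines longitude, then they alternate
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


code = geohash_encode(57.64911, 10.40744, precision=5)  # "u4pru"
```

Dropping characters from the end gives the code of a larger enclosing cell, so a prefix query on the stored hash returns everything in that cell. Note that two nearby points on opposite sides of a cell boundary can have quite different hashes, so neighboring cells usually need to be queried as well.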

AndrewL
A: 

Mapping from the two-dimensional spatial coordinates to your spatial index / geohash is an interesting problem. You might look at this article on quadtrees, geohashes and Hilbert curves. The Hilbert curve is a space-filling curve that provides locality; for your purposes, that means that nearby items in the one-dimensional spatial index will be nearby in two-dimensional space.

The goal (as described by other responders) is to minimize the number of queries necessary to cover the space in question without requesting tons of unnecessary data from the server. How you do the mapping from 2-d space to a 1-d index will affect that goal.
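As a sketch of the Hilbert-curve mapping (using the well-known iterative cell-to-distance conversion; the grid order and the `latlon_to_hilbert` helper are assumptions for illustration), lat/long is first snapped to a cell in a 2^order x 2^order grid and then converted to a position along the curve:

```python
def hilbert_index(order, x, y):
    """Distance along the Hilbert curve of cell (x, y) in a 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def latlon_to_hilbert(lat, lon, order=16):
    """Hypothetical helper: snap lat/lon to a grid cell, then index it."""
    n = 1 << order
    # Clamp so that lat = 90 or lon = 180 still falls in the last cell.
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return hilbert_index(order, x, y)


idx = latlon_to_hilbert(37.7749, -122.4194)
```

The resulting integer can be stored as an ordinary sortable column, and a region then becomes a handful of index ranges to scan. How many ranges are needed for a given region is exactly the trade-off described above.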

npdoty