I have a 3D grid of size AxBxC with equal spacing, d, between the grid points. Given a cloud of points, what is the best way of finding, for each grid point, the distance to the closest point in the cloud (every grid point should end up holding its nearest-point distance), given the assumptions below?

Assume that A, B and C are quite big in relation to d, giving a grid of maybe 500x500x500, and that there will be around 1 million points.

Also assume that if the distance to the nearest point exceeds a distance D, we do not care about the nearest-point distance, and it can safely be set to some large number (D is maybe 2 to 10 times d).

Since there will be a great number of grid points and points to search through, a simple exhaustive search:

for each grid point:
   for each point:
     if distance between points < minDistance:
       minDistance = distance between points

is not a good alternative.

I was thinking of doing something along the lines of:

create a container of size A*B*C where each element holds a container of points
for each point:
  define indexX = round((point position x - grid min position x)/d)
  // same for y and z
  add the point to the correct index of the container

for each grid point:
  search the container of that grid point and find the closest point
  if no points in container and D > 0.5d:
    search the 26 container indices nearest to the grid point for a closest point
  .. continue with the next layer until a point is found or the distance to that layer is greater than D

Basically: put the points in buckets and do a radial search outwards until a point is found for each grid point. Is this a good way of solving the problem, or are there better/faster ways? A solution which is good for parallelisation is preferred.
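In code, a rough (unoptimised) pure-Python sketch of what I have in mind; all names are illustrative, and the triple loop over grid points would of course need to be vectorised or split into slabs across workers:

import math
from collections import defaultdict

def nearest_distances(grid_shape, grid_min, d, points, D, big=1e30):
    """Distance from every grid node to its nearest sample point, capped at D."""
    # Bucket each sample point under the index of its nearest grid node.
    buckets = defaultdict(list)
    for p in points:
        key = tuple(round((p[m] - grid_min[m]) / d) for m in range(3))
        buckets[key].append(p)

    A, B, C = grid_shape
    max_layer = math.ceil(D / d + 0.5)   # farther layers cannot beat D
    dist = {}
    for i in range(A):
        for j in range(B):
            for k in range(C):
                gp = (grid_min[0] + i * d, grid_min[1] + j * d, grid_min[2] + k * d)
                best = big
                for layer in range(max_layer + 1):
                    # A point bucketed at Chebyshev offset `layer` is at least
                    # (layer - 0.5) * d away; stop once that bound exceeds both
                    # the best distance so far and the cutoff D.
                    if (layer - 0.5) * d > min(best, D):
                        break
                    for di in range(-layer, layer + 1):
                        for dj in range(-layer, layer + 1):
                            for dk in range(-layer, layer + 1):
                                if max(abs(di), abs(dj), abs(dk)) != layer:
                                    continue   # inner cells were already scanned
                                for p in buckets.get((i + di, j + dj, k + dk), ()):
                                    best = min(best, math.dist(gp, p))
                dist[i, j, k] = best if best <= D else big
    return dist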

+1  A: 

Take a look at octrees. They're a data structure often used to partition 3D space so that lookups for spatially nearby objects are efficient.
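For illustration, a minimal point-region octree sketch in Python (the capacity, names, and simple split rule are all illustrative choices, not a tuned implementation); near(q, r) returns the stored points within distance r of q, pruning octants whose cube cannot reach the query ball:

from dataclasses import dataclass, field

@dataclass
class Octree:
    center: tuple           # centre of this cube
    half: float             # half-width of this cube
    cap: int = 8            # points held before subdividing
    points: list = field(default_factory=list)
    children: list = None   # the 8 child octants, once split

    def insert(self, p):
        if self.children is None:
            self.points.append(p)
            if len(self.points) > self.cap and self.half > 1e-9:
                self._split()
        else:
            self._child(p).insert(p)

    def _split(self):
        h = self.half / 2
        cx, cy, cz = self.center
        self.children = [Octree((cx + sx * h, cy + sy * h, cz + sz * h), h, self.cap)
                         for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
        pts, self.points = self.points, []
        for p in pts:
            self._child(p).insert(p)

    def _child(self, p):
        cx, cy, cz = self.center
        # Index matches the (sx, sy, sz) ordering used in _split above.
        return self.children[(p[0] >= cx) * 4 + (p[1] >= cy) * 2 + (p[2] >= cz)]

    def near(self, q, r):
        # Prune: if the cube is farther than r from q along any axis, skip it.
        if any(abs(q[m] - self.center[m]) > self.half + r for m in range(3)):
            return []
        if self.children is None:
            return [p for p in self.points
                    if sum((p[m] - q[m]) ** 2 for m in range(3)) <= r * r]
        return [p for c in self.children for p in c.near(q, r)]

Each grid point would then call near with radius D and take the minimum distance over the returned points, or the large "don't care" value when nothing comes back.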

Amber
+1  A: 

One approach, which may or may not suit your application, is to recast your thinking and define each grid 'point' to be the centre of a cube which divides your space into cells. You then have a 3D array of such cells and store the points in the cells -- choose the most appropriate data structure. To use your own words, put the points in buckets in the first place.

I guess that you may be running some sort of large-scale simulation, and the approach I suggest is not unusual in such applications. At each time step (if I've guessed correctly) you have to recalculate the distance from each cell to the nearest point, and move points from cell to cell. This will parallelise very easily.
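If numpy is available, the binning step itself is cheap and naturally data-parallel. A sketch with made-up sizes (the helper name points_in_cell is illustrative):

import numpy as np

A = B = C = 500
d = 1.0
grid_min = np.zeros(3)
rng = np.random.default_rng(1)
points = rng.uniform(0, (A - 1) * d, size=(1_000_000, 3))  # synthetic samples

# Nearest-cell index for every point, flattened to a single id per point.
cell = np.rint((points - grid_min) / d).astype(np.intp)
flat = np.ravel_multi_index((cell[:, 0], cell[:, 1], cell[:, 2]), (A, B, C))

# Sorting groups the points by cell; a cell's slice is found by binary search,
# and the slices can be handed to independent workers.
order = np.argsort(flat)
sorted_ids = flat[order]

def points_in_cell(i, j, k):
    cid = np.ravel_multi_index((i, j, k), (A, B, C))
    lo, hi = np.searchsorted(sorted_ids, [cid, cid + 1])
    return points[order[lo:hi]]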

EDIT: Googling around for particle-particle (PP) and particle-particle particle-mesh (P3M) methods may throw up some ideas for you.

High Performance Mark
Also search for "3d voronoi diagram".
starblue
@starblue Since OP is working on a lattice, I don't understand the relevance of Voronoi diagrams.
High Performance Mark
OP has lattice points AND sample points. You'd make a Voronoi diagram with the sample points and query it with the lattice points. But 3d Voronoi diagrams are expensive, not sure it is the best strategy.
Keith Randall
The idea of building a 3D voronoi diagram for 10^6 points at each time step in a long-running scientific simulation makes my head hurt just to think about it :-(
High Performance Mark
+2  A: 

You can build a nearest neighbor search structure (Wikipedia) on your sample points, then query it for each of your grid points. There are a bunch of algorithms mentioned on the Wikipedia page. Perhaps octrees, kd-trees, or R-trees would be appropriate.
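As one concrete instance (assuming SciPy is an option), a k-d tree supports the capped search from the question directly through distance_upper_bound, and workers=-1 parallelises the queries on reasonably recent SciPy versions:

import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 499.0, size=(1_000_000, 3))  # the M sample points
tree = cKDTree(samples)

d, D = 1.0, 5.0
# Query the grid one z-slab at a time to bound memory; one slab shown here.
x, y = np.meshgrid(np.arange(500) * d, np.arange(500) * d, indexing="ij")
slab = np.column_stack([x.ravel(), y.ravel(), np.full(x.size, 250.0)])

dist, _ = tree.query(slab, k=1, distance_upper_bound=D, workers=-1)
dist[np.isinf(dist)] = 1e30   # beyond D we "do not care", per the question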

Keith Randall
+1  A: 

Actually, I think I have a better way to go, as the number of grid points is much larger than the number of sample points. Let |Grid| = N, |Samples| = M, then the nearest neighbor search algorithms will be something like O(N lg M), as you need to look up all N grid points, and each lookup is (best case) O(lg M).

Instead, loop over the sample points. Store for each grid point the closest sample point found so far. For each sample point, just check all grid points within distance D of the sample to see if the current sample is closer than any previously processed samples.

The running time is then O(N + (D/d)^3 M), which should be better when D/d is small. (With N = 500^3 ≈ 1.25e8 and M = 1e6: D/d = 2 gives (D/d)^3 M = 8e6, far below N, and even D/d = 10 gives 1e9, still under N lg M ≈ 2.5e9.)

Even when D/d is larger, you might still be OK if you can work out a cutoff strategy. For example, suppose we're checking a grid point at distance 5 from our sample, and that grid point is already marked as being at distance 1 from a previous sample. Then all grid points "beyond" that grid point don't need to be checked, because the previous sample is guaranteed to be closer than the sample we're currently processing. All you have to do (and I don't think it is easy, but it should be doable) is define what "beyond" means and figure out how to iterate through the grid so as to skip the areas "beyond" such grid points.
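A numpy sketch of the basic scatter, without the "beyond" cutoff (sizes are scaled down and all names are illustrative):

import numpy as np

A = B = C = 100                 # small grid for the sketch; 500 in the question
d, D = 1.0, 3.0
rng = np.random.default_rng(2)
samples = rng.uniform(0, (A - 1) * d, size=(10_000, 3))

best = np.full((A, B, C), np.inf)   # per-node distance to nearest sample so far
r = int(np.ceil(D / d))             # cube half-width in cells

for s in samples:
    c = np.rint(s / d).astype(int)              # grid node nearest the sample
    lo = np.maximum(c - r, 0)
    hi = np.minimum(c + r + 1, [A, B, C])
    gx = np.arange(lo[0], hi[0]) * d            # node coordinates of the
    gy = np.arange(lo[1], hi[1]) * d            # clipped cube around s
    gz = np.arange(lo[2], hi[2]) * d
    X, Y, Z = np.meshgrid(gx, gy, gz, indexing="ij")
    dist = np.sqrt((X - s[0]) ** 2 + (Y - s[1]) ** 2 + (Z - s[2]) ** 2)
    view = best[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    np.minimum(view, dist, out=view)            # keep whichever is closer

best[best > D] = 1e30                           # cap at D per the question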

Keith Randall
+1  A: 

A note on Keith Randall's method, expanding shells or cubes around the startpoints:
One can expand in various orders. Here's some python-style pseudocode:

S = set of 1m startpoints
near = grid 500x500x500 -> nearest s in S
    initially s for s in S, else 0
for r in 1 .. D:
    for s in S:
        nnew = 0 
        for p in shell of radius r around s:
            if near[p] == 0:
                near[p] = s
                nnew += 1
        if nnew == 0:
            remove s from S  # bonk, stop expanding from s

"Stop expanding from s early" is fine in 1d (bonk left, bonk right); but 2d / 3d shells are irregular.
It's easier / faster to do whole cubes in one pass:

near = grid 500x500x500 -> { dist, nearest s in S }
    initially { 0, s } for s in S, else { infinity, 0 }
for s in S:
    for p in approximatecube of radius D around s:
        if |p - s| < near[p].dist:  # is s nearer ?
            near[p] = { |p - s|, s }

Here "approximatecube" may be a full DxDxD cube, or you could lop off the corners like (here 2d)

0 1 2 3 4 
1 1 2 3 4 
2 2 3 4 4
3 3 4 4
4 4 4

Also fwiw, with erik's numbers, there are on average 500^3/1M ~ 2^7 ~ 5^3 empty grid points per sample point. So I at first thought that 5x5x5 cubes around 1M sample points would cover most of the whole grid. Not so: a given grid point escapes all 1M cubes with probability (1 - 125/500^3)^1M ≈ e^-1, so ~1/e of the grid points stay empty -- Poisson distribution.
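That fraction is easy to check numerically at the same density, e.g. on a 100^3 grid with 8000 samples so that M * 125 equals the number of grid points. Periodic shifts via np.roll keep the sketch short (an assumption that differs slightly from the real, non-periodic grid):

import numpy as np

n, m = 100, 8000                 # same density as 1M samples on a 500^3 grid
rng = np.random.default_rng(0)
hit = np.zeros((n, n, n), bool)
idx = rng.integers(0, n, size=(m, 3))
hit[idx[:, 0], idx[:, 1], idx[:, 2]] = True

covered = np.zeros_like(hit)
for dx in range(-2, 3):          # union of 5x5x5 cubes around the samples
    for dy in range(-2, 3):
        for dz in range(-2, 3):
            covered |= np.roll(hit, (dx, dy, dz), axis=(0, 1, 2))

print(1 - covered.mean())        # fraction left empty, ~ 1/e ~ 0.37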

Denis