Given a collection of random points within a grid, how do you efficiently check that they all lie within a fixed range of other points? I.e., picking any one random point, you can then navigate to any other point in the grid.

To clarify further: if you have a 1000 x 1000 grid and randomly place 100 points in it, how can you prove that every point is within 100 units of a neighbour and that all points are accessible by walking from one point to another?

I've been writing some code and ran into an interesting problem: very occasionally (just once so far) it creates an island of points which exceeds the maximum range from the rest of the points. I need to fix this problem, but brute force doesn't appear to be the answer.

It's being written in Java, but I am good with either pseudo-code or C++.

+1  A: 
csl
Heh, I am surprised I didn't think of that, as I used a recursive algorithm to get an effective distribution across the grid; but when I went for a 'blob' within range of the centre point, I started along the lines of brute force.
graham.reeds
Hmm. I think your method is sufficient but not necessary -- i.e. if it returns TRUE then the points are adequate, but if it returns FALSE then that doesn't mean the points are inadequate.
Jason S
+1  A: 

Find the convex hull of the point set, and then use the rotating calipers method. The two most distant points on the convex hull are the two most distant points in the set. Since all other points are contained in the convex hull, they are guaranteed to be closer than the two extremal points.

zvrba
Either you've misunderstood what I need or I've misunderstood what you are getting at. I don't see how the rotating calipers would aid me with my problem?
graham.reeds
After re-reading your question several times, it turns out that I had misunderstood what you need.
zvrba
A: 

Force the desired condition by construction. Instead of placing all points solely by drawing random numbers, constrain the coordinates as follows:

  1. Randomly place an initial point.

  2. Repeat for the remaining number of points (e.g. 99):

    2.1. Randomly select an x-coordinate within some range (e.g. 90) of the previous point.

    2.2. Compute the legal range for the y-coordinate that will make it within 100 units of the previous point.

    2.3. Randomly select a y-coordinate within that range.

  3. If you want to completely obscure the origin, sort the points by their coordinate pair.

This will not require much overhead vs. pure randomness, but will guarantee that each point is within 100 units of at least one other point (actually, except for the first and last, each point will be within 100 units of two other points).

As a variation on the above, in step 2, randomly choose any already-generated point and use it as the reference instead of the previous point.
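A minimal Java sketch of the construction above (class and method names are my own; the 90-unit x-range, 100-unit limit, and 1000-unit grid are from the answer):

```java
import java.util.Random;

public class ChainPlacement {
    static final int GRID = 1000;
    static final double R = 100.0, DX_RANGE = 90.0;

    public static double[][] place(int n, Random rng) {
        double[][] pts = new double[n][2];
        pts[0][0] = rng.nextDouble() * GRID;   // step 1: random initial point
        pts[0][1] = rng.nextDouble() * GRID;
        for (int i = 1; i < n; i++) {
            double px = pts[i - 1][0], py = pts[i - 1][1];
            // step 2.1: x within +/-90 of the previous point, clamped to the grid
            double lox = Math.max(0, px - DX_RANGE), hix = Math.min(GRID, px + DX_RANGE);
            double x = lox + rng.nextDouble() * (hix - lox);
            // step 2.2: legal y-range keeping the point within 100 units of the previous one
            double dy = Math.sqrt(R * R - (x - px) * (x - px));
            double loy = Math.max(0, py - dy), hiy = Math.min(GRID, py + dy);
            // step 2.3: random y within that range
            pts[i][0] = x;
            pts[i][1] = loy + rng.nextDouble() * (hiy - loy);
        }
        return pts;
    }
}
```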

joel.neely
You will lose the uniform density with this approach.
Guillaume
+2  A: 

I like @joel.neely's construction approach, but if you want to ensure a more uniform density, this is more likely to work (though it would probably produce more of a cluster than an overall uniform density):

  1. Randomly place an initial point P_0 by picking x,y from a uniform distribution within the valid grid
  2. For i = 1:N-1
    1. Choose random j = uniformly distributed from 0 to i-1, identify point P_j which has been previously placed
    2. Choose random point P_i where distance(P_i,P_j) < 100, by repeating the following until a valid P_i is chosen in substep 4 below:
      1. Choose (dx,dy) each uniformly distributed from -100 to +100
      2. If dx^2+dy^2 > 100^2, the candidate lies outside the disc of radius 100 (this happens 1 - pi/4 ≈ 21.5% of the time); go back to substep 1.
      3. Calculate candidate coords(P_i) = coords(P_j) + (dx,dy).
      4. P_i is valid if it is inside the overall valid grid.
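A Java sketch of this procedure (class and method names are mine; storing points as `double[]{x, y}` is my choice):

```java
import java.util.Random;

public class ClusterPlacement {
    static final int GRID = 1000;
    static final double R = 100.0;

    public static double[][] place(int n, Random rng) {
        double[][] p = new double[n][2];
        p[0][0] = rng.nextDouble() * GRID;   // step 1: P_0 uniform in the grid
        p[0][1] = rng.nextDouble() * GRID;
        for (int i = 1; i < n; i++) {
            int j = rng.nextInt(i);          // step 2.1: pick a previously placed P_j
            while (true) {                   // step 2.2: rejection-sample P_i
                double dx = (2 * rng.nextDouble() - 1) * R;  // substep 1: uniform [-R, R]
                double dy = (2 * rng.nextDouble() - 1) * R;
                if (dx * dx + dy * dy > R * R) continue;     // substep 2: outside the disc
                double x = p[j][0] + dx, y = p[j][1] + dy;   // substep 3: candidate coords
                if (x < 0 || x >= GRID || y < 0 || y >= GRID) continue; // substep 4
                p[i][0] = x;
                p[i][1] = y;
                break;
            }
        }
        return p;
    }
}
```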
Jason S
Just to add: in step 2 you don't need to do the full distance calculation. Since this is on a grid, the distance travelled would simply be the Manhattan distance, |x| + |y|.
ReaperUnreal
@ReaperUnreal: the original poster didn't specify which metric he was using (Euclidean or taxicab).
joel.neely
+1  A: 

As far as evaluating existing sets of points, this looks like a Euclidean minimum spanning tree problem. The Wikipedia page states that the EMST is a subgraph of the Delaunay triangulation; so it should be sufficient to compute the Delaunay triangulation (see the previous reference, or google "computational geometry"), then the minimum spanning tree, and verify that all its edges have length less than 100.

From reading the references, it appears that this is O(N log N); maybe there is a quicker way, but this is sufficient.

A simpler (but probably less efficient) algorithm would be something like the following:

  1. Given: the points are in an array from index 0 to N-1.
  2. Sort the points in x-coordinate order, which is O(N log N) for an efficient sort.
  3. Initialize i = 0.
  4. Increment i. If i == N, stop with success. (All points can be reached from another with radius R)
  5. Initialize j = i.
  6. Decrement j.
  7. If j<0 or P[i].x - P[j].x > R, stop with failure. (Either there is an x-gap wider than R, or no earlier point in x-sorted order lies within radius R of P[i].)
  8. Otherwise, we get here if P[i].x and P[j].x are within R of each other. Check whether point P[j] is sufficiently close to P[i]: if (P[i].x-P[j].x)^2 + (P[i].y-P[j].y)^2 < R^2, then point P[i] is reachable from one of the previous points within radius R; go back to step 4.
  9. Keep trying: go back to step 6.
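The steps above translate fairly directly to Java (a sketch; the class name and storing points as `double[]{x, y}` are my choices):

```java
import java.util.Arrays;
import java.util.Comparator;

public class SweepCheck {
    /** Returns true if every point (in x-sorted order) is within radius r
     *  of some earlier point, per steps 1-9 above. */
    public static boolean allReachable(double[][] pts, double r) {
        double[][] p = pts.clone();
        Arrays.sort(p, Comparator.comparingDouble(a -> a[0]));  // step 2
        for (int i = 1; i < p.length; i++) {                    // steps 3-4
            boolean found = false;
            // steps 5-7: scan earlier points whose x is within r of p[i]
            for (int j = i - 1; j >= 0 && p[i][0] - p[j][0] <= r; j--) {
                double dx = p[i][0] - p[j][0], dy = p[i][1] - p[j][1];
                if (dx * dx + dy * dy < r * r) { found = true; break; }  // step 8
            }
            if (!found) return false;                           // step 7: failure
        }
        return true;
    }
}
```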

Edit: this could be modified to something that should be O(N log N) but I'm not sure:

  1. Given: the points are in an array from index 0 to N-1.
  2. Sort the points in x-coordinate order, which is O(N log N) for an efficient sort.
  3. Maintain a sorted set YLIST of points in y-coordinate order, initializing YLIST to the set {P[0]}. We'll be sweeping the x-coordinate from left to right, adding points one by one to YLIST and removing points that have an x-coordinate that is too far away from the newly-added point.
  4. Initialize i = 0, j = 0.
  5. Loop invariant always true at this point: All points P[k] where k <= i form a network where they can be reached from each other with radius R. All points within YLIST have x-coordinates that are between P[i].x-R and P[i].x
  6. Increment i. If i == N, stop with success.
  7. If P[i].x-P[j].x <= R, go to step 10. (this is automatically true if i == j)
  8. Point P[j] is not reachable from point P[i] with radius R. Remove P[j] from YLIST (this is O(log N)).
  9. Increment j, go to step 7.
  10. At this point, all points P[j] with j<i and x-coordinates between P[i].x-R and P[i].x are in the set YLIST.
  11. Add P[i] to YLIST (this is O(log N)), and remember the index k within YLIST where YLIST[k]==P[i].
  12. Points YLIST[k-1] and YLIST[k+1] (if they exist; P[i] may be the only element in YLIST, or it may be at an extreme end) are the points in YLIST whose y-coordinates are closest to P[i]'s.
  13. If point YLIST[k-1] exists and is within radius R of P[i], then P[i] is reachable with radius R from at least one of the previous points. Go to step 5.
  14. If point YLIST[k+1] exists and is within radius R of P[i], then P[i] is reachable with radius R from at least one of the previous points. Go to step 5.
  15. P[i] is not reachable from any of the previous points. Stop with failure.
Jason S
+1  A: 

New and Improved ;-)

Thanks to Guillaume and Jason S for comments that made me think a bit more. That has produced a second proposal whose statistics show a significant improvement.

Guillaume remarked that the earlier strategy I posted would lose uniform density. Of course, he is right, because it's essentially a "drunkard's walk" which tends to orbit the original point. However, uniform random placement of the points yields a significant probability of failing the "path" requirement (all points being connectible by a path with no step greater than 100). Testing for that condition is expensive; generating purely random solutions until one passes is even more so.

Jason S offered a variation, but statistical testing over a large number of simulations leads me to conclude that his variation produces patterns that are just as clustered as those from my first proposal (based on examining mean and std. dev. of coordinate values).

The revised algorithm below produces point sets whose statistics are very similar to those of purely (uniform) random placement, but which are guaranteed by construction to satisfy the path requirement. Unfortunately, it's a bit easier to visualize than to explain verbally. In effect, it requires the points to stagger randomly in a vaguely consistent direction (NE, SE, SW, NW), only changing direction when "bouncing off a wall".

Here's the high-level overview:

  1. Pick an initial point at random, set horizontal travel to RIGHT and vertical travel to DOWN.

  2. Repeat for the remaining number of points (e.g. 99 in the original spec):

    2.1. Randomly choose dx and dy whose distance is between 50 and 100. (I assumed Euclidean distance -- square root of sums of squares -- in my trial implementation, but "taxicab" distance -- sum of absolute values -- would be even easier to code.)

    2.2. Apply dx and dy to the previous point, based on horizontal and vertical travel (RIGHT/DOWN -> add, LEFT/UP -> subtract).

    2.3. If either coordinate goes out of bounds (less than 0 or at least 1000), reflect that coordinate around the boundary violated, and replace its travel with the opposite direction. This means four cases (2 coordinates x 2 boundaries):

    2.3.1. if x < 0, then x = -x and reverse LEFT/RIGHT horizontal travel.
    2.3.2. if 1000 <= x, then x = 1999 - x and reverse LEFT/RIGHT horizontal travel.
    2.3.3. if y < 0, then y = -y and reverse UP/DOWN vertical travel.
    2.3.4. if 1000 <= y, then y = 1999 - y and reverse UP/DOWN vertical travel.
    

Note that the reflections under step 2.3 are guaranteed to leave the new point within 100 units of the previous point, so the path requirement is preserved. However, the horizontal and vertical travel constraints force the generation of points to "sweep" randomly across the entire space, producing more total dispersion than the original pure "drunkard's walk" algorithm.
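A Java sketch of the revised algorithm, using integer coordinates in [0, 1000) and Euclidean step length (class and variable names are mine):

```java
import java.util.Random;

public class BouncingWalk {
    static final int GRID = 1000;

    public static int[][] place(int n, Random rng) {
        int[][] p = new int[n][2];
        p[0][0] = rng.nextInt(GRID);           // step 1: random start,
        p[0][1] = rng.nextInt(GRID);           // travel RIGHT (+1) and DOWN (+1)
        int sx = 1, sy = 1;
        for (int i = 1; i < n; i++) {
            int dx, dy;
            do {                               // step 2.1: step length in [50, 100]
                dx = rng.nextInt(101);
                dy = rng.nextInt(101);
            } while (dx * dx + dy * dy < 50 * 50 || dx * dx + dy * dy > 100 * 100);
            int x = p[i - 1][0] + sx * dx;     // step 2.2: apply travel direction
            int y = p[i - 1][1] + sy * dy;
            // step 2.3: reflect off the boundaries and reverse travel
            if (x < 0)           { x = -x;                sx = -sx; }  // 2.3.1
            else if (x >= GRID)  { x = 2 * GRID - 1 - x;  sx = -sx; }  // 2.3.2 (1999 - x)
            if (y < 0)           { y = -y;                sy = -sy; }  // 2.3.3
            else if (y >= GRID)  { y = 2 * GRID - 1 - y;  sy = -sy; }  // 2.3.4
            p[i][0] = x;
            p[i][1] = y;
        }
        return p;
    }
}
```

As the note above observes, a reflection can only shrink the per-axis offset from the previous point, so each step stays within 100 units.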

joel.neely
+1  A: 

If I understand your problem correctly, given a set of sites, you want to test whether the nearest neighbor (for the L1 distance, i.e. the grid distance) of each site is at distance less than a value K.

This is easily obtained for the Euclidean distance by computing the Delaunay triangulation of the set of points: the nearest neighbor of a site is one of its neighbors in the Delaunay triangulation. Note that the L1 distance between two points is at least their Euclidean distance, and at most sqrt(2) times it.

It follows that a way of testing your condition is the following:

  1. compute the Delaunay triangulation of the sites
  2. for each site s, start a breadth-first search from s in the triangulation, so that you discover all the vertices at Euclidean distance less than K from s (the Delaunay triangulation has the property that the set of vertices at distance less than K from a given site is connected in the triangulation)
  3. for each site s, among these vertices at distance less than K from s, check if any of them is at L1 distance less than K from s. If not, the property is not satisfied.

This algorithm can be improved in several ways:

  1. the breadth-first search at step 2 should of course be stopped as soon as a site at L1 distance less than K is found.
  2. during the search for a valid neighbor of s, if a site s' is found to be at L1 distance less than K from s, there is no need to look for a valid neighbor for s': s is obviously one of them.
  3. a complete breadth-first search is not needed: after visiting all triangles incident to s, if none of the neighbors of s in the triangulation is a valid neighbor (i.e. a site at L1 distance less than K), denote the neighbors by (v1,...,vn). At most four of the edges (vi, vi+1) intersect the horizontal and vertical lines through s, and the search only needs to continue through those four (or fewer) edges. [This follows from the shape of the L1 ball.]
Camille