tags:

views:

163

answers:

3

Yesterday I read a problem which can be translated into the following problem with slight modification:

The coordinate of a dot is expressed by (x, y) in a 2D space.

Input : An array of dots ARRAY = (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn) and another dot D = (xi, yi)

Find the dot in the ARRAY which is nearest to D.

By saying "nearest", I am referring to the Euclidian distance.


There is an obvious solution: traverse each dot in the ARRAY and compute its Euclidian distance to D. Then, find the shortest distance. This can be done in O(N) time.

Can we do faster, say, in O(logN)? I am trying to use divide-and-conquer approach, but haven't come to a concrete idea yet.


Generalization of the problem: how to find the nearest dot in a K-dimensional space? Can we do it in less than O(N)?

+1  A: 

You can use a Bounding Volume Hierarchy. This does not improve on the worst-case complexity though, nor does it directly lead to an exact solution.

Ani
+5  A: 

If the array is not sorted in any way, then it is not possible to do any faster than O(n), as you have to check every point to make sure it is not closer than anything you have found so far.

You CAN do some preprocessing to sort the points or pack them into some structure, then do searches on that structure. In this case, while the preprocessing step may be slower than O(n), the individual searches may be faster. If you have many points to check against the same set of points, this can be advantageous.

slacker
Yeah, you are right. If the array is not sorted in any way, it is unlikely to do any faster than O(n).
SiLent SoNG
@SiLent SoNG:"Unlikely" is not the right word here. It is **impossible** to do any faster. You *have to* do *something* to each element to know that it isn't closer than any other you have already checked. Unless, of course you have found a point in the array that is exactly equal to the point D. Then you definitely won't find any other that is closer :).
slacker
+1  A: 

I have come out with a solution in O(logN) if the dots are sorted on x coordinate.

It uses a divide-and-conquer approach. I divide the dots array based on their x coordinate in the 2D space.

Consider the 2D case.

Assume each dot is expressed in the following data structure:

class Point {
    public float getX();
    public float getY();
}

We have two inputs: dot array ARRAY, and another dot D.

Initially, we would like to partition ARRAY into two parts: those dots that are on the "left" of D, and those dots are on the "right" of D.

int pivotIndex = partition(array, 0, array.length() - 1, d);

After partition, the dots with index less than pivotIndex have x coordinate less than d.getX(); the dots with index equal or greater than pivotIndex have x coordinate equal or greater than d.getX().

If all dots are on the left side of D, pivotIndex would be array.length() - 1. If all dots are on the right side of D, pivotIndex would be -1. If some dots are on the left of D and some dots are on the right of D, then pivotIndex would be between 0 and array.length() - 1. For dots having the same x coordinate as D, they are considered as on the "right" side.

Now, the next step is to search the nearest dot on each partition:

Point p1 = getNearestDot(array, 0, pivotIndex, d);
Point p2 = getNearestDot(array, pivotIndex + 1, array.length() - 1, d);

if (p1 == null) return p2;
if (p2 == null) return p1;

return nearer(p1, p2, d);

It is possible that all the dots in the ARRAY are on the left side of D, then p2 would be null in this case. Similarly, if all dots in the ARRAY are on the right side of D, then p1 would be null.

The algorithm of getNearestDot works as below:

// Find the nearest dot in array[low...high] inclusive which is closest to point d 
Point getNearestDot(Point[] array, int low, int high, Point d) {
    if (low > high)
        return null;
    if (low == high)
        return array[low];

    int middle = low + (high - low) >> 1;
    Point p1 = getNearestDot(array, low, middle, d);
    Point p2 = getNearestDot(array, middle + 1, high, d);

    if (p1 == null) return p2;
    if (p2 == null) return p1;

    return nearer(p1, p2, d);
} 

And finally, the function nearer(p1, p2, d) is to return either p1 or p2, whose distance to d is shorter.

SiLent SoNG
Yeah, that's the same algorithm that first came to my mind when I read this question. Note that in degenerate cases this algorithm is still linear. E.g. if all points have equal x coordinate. Personally, I would just use a simple quadtree instead.
slacker
How is that O(log(n))? Sure, it looks like a binary search which is O(log(n)), but you're always following both branches of the tree. It's at least O(n).
xan