Dividing a plane of points into two equal halves

+13 Q:

Dividing a plane of points into two equal halves

Given a 2-dimensional plane containing n points, I need to generate the equation of a line that divides the plane so that n/2 points lie on one side and n/2 points on the other. (By the way, this is not homework; I am just trying to solve the problem.)

+1 A:

I'd guess that a good way is to sort/sequence/order the points (e.g. from left to right), and then choose a line which passes through (or between) the middle point[s] in the sequence.

ChrisW 2010-06-23 23:40:42

@chrisW if all points lie on a vertical line it doesn't work

mousey 2010-06-23 23:43:12

@chrisW: What if all the points have the same x value? Also, if exactly half are on one side and half on the other, the line can't pass through any of them (maybe one, if n is odd).

BlueRaja - Danny Pflughoeft 2010-06-23 23:43:16

+6 A:

Create an arbitrary line in that plane. Project each point onto that line, i.e. for each point, get the closest point on that line to that point.
Order those points along the line in either direction, and choose a point on that line such that there is an equal number of points on the line in either direction.
Get the line perpendicular to the first line which passes through that point. This line will have half the original points on either side.

There are some cases to avoid when doing this. Most importantly, if all the points are themselves on a single line, don't choose a perpendicular line which passes through it. In fact, choose that line itself so you don't have to worry about projecting the points. In terms of the actual mathematics behind this, vector projections will be very useful.
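The three steps above can be sketched roughly as follows. This is a minimal illustration, not tlayton's own code: the function name `split_by_projection` and the `direction` parameter are made up here, n is assumed to be even, and the function simply returns `None` when the two middle projections coincide (the case discussed in the comments below).

```python
def split_by_projection(points, direction=(3, 1)):
    """Return (a, b, c) for the splitting line a*x + b*y = c, or None.

    `direction` is the direction vector of the arbitrary first line
    (a hypothetical parameter). The returned line is perpendicular to it.
    """
    dx, dy = direction
    # The dot product gives each point's position along the first line.
    # Only the ordering matters, so the vector need not be normalised.
    proj = sorted(x * dx + y * dy for x, y in points)
    n = len(proj)
    lo, hi = proj[n // 2 - 1], proj[n // 2]
    if lo == hi:
        # The two middle projections coincide: no perpendicular line
        # in this direction splits the set; try another direction.
        return None
    # Every point with a*x + b*y < c is on one side, the rest on the other.
    return dx, dy, (lo + hi) / 2
```

If the result is `None`, calling the function again with a different `direction` corresponds to "choose a different line in part 1".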

tlayton 2010-06-23 23:45:46

Same problem as Chris' solution, multiple points could have same projection.

BlueRaja - Danny Pflughoeft 2010-06-23 23:47:45

That's only an issue if those points are the two bracketing the middle. In that case, choose a different line in part 1.

tlayton 2010-06-23 23:50:13

So you are suggesting "choose a random line, see if you can fit it to split the set in two, otherwise choose another random line?"

BlueRaja - Danny Pflughoeft 2010-06-23 23:54:06

The final resulting line will be perpendicular to the line you choose. So yes, but the new line will give you a different result. But if you encounter this problem, it means that that particular set of points can't be split like this with a line in that particular direction.

tlayton 2010-06-23 23:59:43

There is a deterministic way (though currently O(nlogn)) to find it without any guessing (see my answer). Guessing is more practical though, as the chances of getting a bad guess are negligible.

Moron 2010-06-24 02:39:25

+1 A:

There are obvious cases where no solution is possible, e.g. when you have three heaps of points: one point at location A, two points at location B, and five points at location C.

If you expect some decent distribution, you can probably get a good result with tlayton's algorithm. To select the initial line slant, you could determine the extent of the whole point set, and choose the angle of the largest diagonal.

relet 2010-06-23 23:53:25

I don't find it obvious - in fact, I don't think it's true. I can't imagine a scenario where you couldn't divide C in half such that A and three points from C are in one half, B and two points from C are in the other.

BlueRaja - Danny Pflughoeft 2010-06-23 23:56:36

You cannot divide C in half. Either all five points are on the line, or they are not.

relet 2010-06-23 23:58:50

Oh I see what you are saying - I was under the impression all n points were distinct.

BlueRaja - Danny Pflughoeft 2010-06-24 00:00:05

Heh, indeed. That needs to be clarified. I was thinking of n points as distinct objects in a list, which may refer to the same coordinates. :)

relet 2010-06-24 00:00:57

I don't know how useful this is, but I have seen a similar problem...

If you already have the directional vector (a.k.a. the coefficients of the dimensions of your plane), you can then find two points inside that plane, and by simply using the midpoint formula you can find the midpoint of that plane.

Then, using the coefficients of that plane and the midpoint, you can find a plane that is equidistant from both points, using the general equation of a plane.

A line then would constitute in expressing one variable in terms of the other so you would find a line with equal distance between both planes.

There are different methods of doing this such as projection using the distance equation from a plane but I believe that would complicate your math a lot.

2010-06-24 00:08:29

+11 A:

I have assumed the points are distinct, otherwise there might not even be such a line.

If points are distinct, then such a line always exists and is possible to find using a deterministic O(nlogn) time algorithm.

Say the points are P1, P2, ..., P2n. Assume they are not all on the same line. If they were, then we can easily form the splitting line.

First translate the points so that all the co-ordinates (x and y) are positive.

Now suppose we magically had a point Q on the y-axis such that no line formed by those points (i.e. any infinite line Pi-Pj) passes through Q.

Now since Q does not lie within the convex hull of the points, we can easily see that we can order the points by a rotating line passing through Q. For some angle of rotation, half the points will lie on one side of this rotating line and the other half on the other side. In other words, if we consider the points sorted by the slope of the line Pi-Q, we can pick a slope between those of the (median)th and (median+1)th points. This selection can be done in O(n) time by any linear-time selection algorithm, without any need to actually sort the points.
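As a rough illustration of this step, the sketch below assumes a suitable Q = (0, b) is already known, that all x-coordinates are positive and that n is even. The name `slope_between_medians` is invented for this example, and it sorts for brevity where a linear-time selection would be used to reach O(n).

```python
from fractions import Fraction

def slope_between_medians(points, b):
    """Slope m of a line y = m*x + b through Q = (0, b) that splits the points.

    Assumes x > 0 for every point and that no two points are collinear with Q.
    """
    b = Fraction(b)
    # Slope of the line from Q to each point; exact rational arithmetic.
    slopes = sorted((Fraction(y) - b) / Fraction(x) for x, y in points)
    n = len(slopes)
    # Any slope strictly between the two middle slopes works; take the mean.
    return (slopes[n // 2 - 1] + slopes[n // 2]) / 2
```

Because every x is positive, a point lies below the resulting line exactly when its own slope to Q is smaller than m.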

Now to pick the point Q.

Say Q was (0,b).

Suppose Q was collinear with P1 (x1,y1) and P2 (x2,y2).

Then we have that

(y1-b)/x1 = (y2-b)/x2 (note we translated the points so that xi > 0).

Solving for b gives

b = (x1y2 - y1x2)/(x1-x2)

(Note, if x1 = x2, then P1 and P2 cannot be collinear with a point on the Y axis).
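Spelled out step by step, with the same symbols as above and assuming x1 ≠ x2, the algebra behind this value of b is:

```latex
\begin{align*}
\frac{y_1 - b}{x_1} &= \frac{y_2 - b}{x_2} \\
x_2\,(y_1 - b) &= x_1\,(y_2 - b) \\
b\,(x_1 - x_2) &= x_1 y_2 - y_1 x_2 \\
b &= \frac{x_1 y_2 - y_1 x_2}{x_1 - x_2}
\end{align*}
```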

Consider |b|.

|b| = |x1y2 - y1x2| / |x1 -x2|

Now let xmax be the x-coordinate of the rightmost point and ymax the y-coordinate of the topmost point.

Also let D be the smallest non-zero x-coordinate difference between two points (this exists, as not all x_i are the same, as not all points are collinear).

Then we have that |b| <= xmax*ymax/D.

Thus, pick our point Q (0,b) to be such that |b| > b_0 = xmax*ymax/D

D can be found in O(nlogn) time.
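A possible way to compute such a b is sketched below. It assumes the points have already been translated so that all coordinates are positive and that at least two distinct x values exist; the function name `q_height_bound` and the choice of adding 1 to b_0 are illustrative only.

```python
from fractions import Fraction

def q_height_bound(points):
    """Return a value b with b > b_0 = xmax * ymax / D.

    Assumes all coordinates are positive and not all x values are equal.
    """
    # Distinct x values in sorted order: the O(n log n) part.
    xs = sorted({Fraction(x) for x, _ in points})
    # D is the smallest non-zero difference between two x-coordinates,
    # which must occur between neighbours in sorted order.
    d = min(right - left for left, right in zip(xs, xs[1:]))
    xmax = xs[-1]
    ymax = max(Fraction(y) for _, y in points)
    # Anything strictly larger than b_0 will do; b_0 + 1 is one arbitrary choice.
    return xmax * ymax / d + 1
```

Using `Fraction` keeps the result exact, which matters because b_0 can become very large.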

The magnitude of b_0 can get quite large and we might have to deal with precision issues.

Of course, a better option is to pick Q randomly! With probability 1, you will find the point you need, thus making the expected running time O(n).
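The random variant could look roughly like this. It is a sketch under the same assumptions (all x positive, points distinct); `random_q_height`, `spread` and `rng` are names introduced here. It draws integer heights, so "probability 1" only holds approximately: at most n(n-1)/2 heights can be bad, and `spread` should be large compared to that.

```python
import random
from fractions import Fraction

def random_q_height(points, spread=10**6, rng=random):
    """Pick b at random until no two points are collinear with Q = (0, b)."""
    while True:
        b = Fraction(rng.randint(-spread, spread))
        # Two points are collinear with Q exactly when their slopes to Q match.
        slopes = {(Fraction(y) - b) / Fraction(x) for x, y in points}
        if len(slopes) == len(points):
            return b
```

The duplicate check uses a hash set, so each attempt takes expected O(n) time.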

If we could find a way to pick Q in O(n) time (by finding some other criterion), then we can make this algorithm run in O(n) time.

Moron 2010-06-24 02:17:50

I'd been racking my brain all night trying to figure this one out, but you beat me again, M. As soon as you said *"suppose we magically had a point Q on the y-axis..."* it came to me; it seems so easy now... but, too late! +1, amazing work

BlueRaja - Danny Pflughoeft 2010-06-24 05:30:57

@BlueRaja: Thanks! O(n) seems so close, though...

Moron 2010-06-24 05:37:51

@Moron: in 2xmax*ymax/D, the 2 isn't necessary: |x1y2 - y1x2| <= xmax * ymax because xi*yj > 0.

Nicolas Viennot 2010-06-24 13:07:50

@Pafy. You are right. I have edited the answer to reflect that. Thanks!

Moron 2010-06-24 13:18:23

@Moron: You wrote: "...in other words, sort by the slope of the line Pi-Q and pick a slope between the (median)th and (median+1)th points. This can be done in O(n) time by any linear time selection algorithm." If you sorted first, wouldn't picking the median be O(1)? Also, why not forgo the sort and just use the O(n) selection instead?

andand 2010-06-24 15:56:07

@andand: I talked about sorting only to visualize what is going on. We don't actually need to sort, as we need only the 'middle' element.

Moron 2010-06-24 16:07:43

@andand: I have edited the answer to make it clearer. Thanks!

Moron 2010-06-24 16:19:23

I didn't know before that selection is in O(n). Thanks for the clarification! But with that, a deterministic O(n) time algorithm is also possible (see my answer).

rudi-moore 2010-06-24 17:09:14

I came up with a bit more reasonable value for b (ie. one that lies around the median y-value), but I can't seem to do it in O(n). It requires finding a lower bound for the minimum difference between two values in an unsorted list (or finding a lower-bound for the area of a triangle formed by b and any other two points), both of which I can't seem to do any better than O(n log n)

BlueRaja - Danny Pflughoeft 2010-06-24 18:39:40

@BlueRaja: Min difference between two points in unsorted list might help you solve Element Distinctness problem, so O(n) is doubtful for that. Of course, the model matters, so don't know. Min difference can probably be reduced to min-area: Take all points collinear, so min-difference accounts for area (as height is same).

Moron 2010-06-24 18:51:10

@M: That's disappointing to hear. I added my calculation for b anyways, though it is also `O(n log n)`

BlueRaja - Danny Pflughoeft 2010-06-24 21:15:19

+1 A:

andand 2010-06-24 02:18:44

+1 A:

I picked up the idea from Moron and andand and continued to form a deterministic O(n) algorithm.

I also assumed that the points are distinct and n is even (though the algorithm can be changed so that odd n, with one point on the dividing line, is also supported).

The algorithm tries to divide the points with a vertical line between them. This only fails if the points in the middle have the same x value. In that case the algorithm determines how many points with that x value have to be on the left (lower) side and rotates the line accordingly.

I'll try to explain with an example. Let's assume we have 16 points on a plane. First we need the point with the 8th greatest x-value and the point with the 9th greatest x-value. With a selection algorithm this is possible in O(n), as pointed out in another answer. If the x-values of those two points differ, we are done: a vertical line between the two points splits the set equally.

The problematic case is when the x-values are equal. Then we have 3 sets of points: those on the left side (x < x_a), those in the middle (x = x_a) and those on the right side (x > x_a). The idea is to count the points on the left side and calculate how many points from the middle need to join them so that half of the points are on that side. We can ignore the right side here, because if half of the points are on the left side, the other half must be on the right side.

So let's assume we have 3 points (c=3) on the left side, 6 in the middle and 7 on the right side (the algorithm doesn't know the counts for the middle or right side, because it doesn't need them, but we could also determine them in O(n)). We need 8-3=5 points from the middle to go to the left side. The points we already got in the first step are useless now, because they were determined only by their x-value and can be any of the points in the middle.

We want the 5 points from the middle with the lowest y-values on the left side and the point with the highest y-value on the right side. Again using the selection algorithm, we get the points with the 5th-lowest and the 6th-lowest y-values among the middle points. Both points have an x-value equal to x_a; otherwise we wouldn't get to this step, because a vertical line would already have worked.

Now we can create the point Q midway between those two points. That's one point of the resulting line. A second point is needed, chosen so that no points from the left or right side end up on the wrong side. To get it, we need the point on the left side that has the lowest angle (b_h) between the vertical line at x_a and the line through that point and Q. We also need the corresponding point on the right side (with angle a_g). The new point R lies between the point with the lower angle and a point on the vertical line (a point above Q if the lower angle is on the left side, a point below Q if the lower angle is on the right side).

The line determined by Q and R divides the points in the middle so that there is an equal number of points on both sides. It doesn't cut off any points on the left or right side, because any such point would have a lower angle and would have been chosen to calculate R.

From a mathematician's point of view this should work well in O(n). In computer programs, however, it is fairly easy to find a case where precision becomes a problem. An example with 4 points would be A(0, 100000000), B(0, 100000001), C(0, 0), D(0.0000001, 0). In this example Q would be (0, 100000000.5) and R (0.00000005, 0), which puts A and C on the left side and B and D on the right side. But because of rounding errors it is possible that A and B both end up on the dividing line, or maybe only one of them. So whether this algorithm meets the requirements depends on the input values.
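Which side of the line QR each of these four points falls on can be checked without any rounding by using rational arithmetic. The sketch below is only an illustration: the helper `side` and the dictionary layout are made up here, and the test is the sign of a cross product, so points with the same sign lie on the same side.

```python
from fractions import Fraction

def side(q, r, p):
    """Sign (-1, 0 or +1) of the cross product (r - q) x (p - q)."""
    cross = (r[0] - q[0]) * (p[1] - q[1]) - (r[1] - q[1]) * (p[0] - q[0])
    return (cross > 0) - (cross < 0)

# The four points from the example, stored exactly.
pts = {
    "A": (Fraction(0), Fraction(100000000)),
    "B": (Fraction(0), Fraction(100000001)),
    "C": (Fraction(0), Fraction(0)),
    "D": (Fraction(1, 10**7), Fraction(0)),
}
Q = (Fraction(0), Fraction(200000001, 2))   # (0, 100000000.5)
R = (Fraction(1, 2 * 10**7), Fraction(0))   # (0.00000005, 0)

signs = {name: side(Q, R, p) for name, p in pts.items()}
```

A sign of 0 would mean the point lies exactly on the line; with exact fractions that cannot happen by accident, whereas a floating-point evaluation of the same test may or may not preserve the signs, depending on how the line is represented.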

plane of points example

1. Get the two points P_a(x_a, y_a) and P_b(x_b, y_b) which are the medians based on the x values. > O(n)
2. If x_a != x_b you can stop here, because a line parallel to the y-axis between those two points is the result. > O(1)
3. Get all points whose x value equals x_a. > O(n)
4. Count the points with an x value less than x_a as c. > O(n)
5. Get the lowest point P_c, based on the y values, among the points from 3. > O(n)
6. Get the highest point P_d, based on the y values, among the points from 3. > O(n)
7. Get the (n/2-c)th lowest point P_e, based on the y values, among the points from 3. > O(n)
8. Also get the next higher point P_f, based on the y values, among the points from 3. > O(n)
9. Create a new point Q (x_a, (y_e+y_f)/2) between P_e and P_f. > O(1)
10. For all points P_i calculate the angle a_i between P_c, Q and P_i and the angle b_i between P_d, Q and P_i. > O(n)
11. Get the point P_g with the lowest angle a_g (with a_g>0° and a_g<180°). > O(n)
12. Get the point P_h with the lowest angle b_h (with b_h>0° and b_h<180°). > O(n)
13. If there aren't any P_g or P_h (all points have the same x value),
    create a new point R (x_a+1, 0), or anywhere else with an x value different from x_a;
    else if a_g is lower than b_h,
    create a new point R ((x_c+x_g)/2, (y_c+y_g)/2) between P_c and P_g;
    else
    create a new point R ((x_d+x_h)/2, (y_d+y_h)/2) between P_d and P_h. > O(1)
14. The line determined by Q and R divides the points. > O(1)
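One way to try out the idea in Python is sketched below. It is not a transcription of the steps above: instead of comparing angles it bounds a tilt parameter `t` (a name introduced here) so that no left or right point changes sides, it sorts instead of using O(n) selection, and it uses exact `Fraction` arithmetic to sidestep the precision problem. `split_line` is a hypothetical name; distinct points and even n are assumed.

```python
from fractions import Fraction

def split_line(points):
    """Return (a, b, c): a*x + b*y < c holds for exactly half of the points."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    n = len(pts)
    xs = sorted(x for x, _ in pts)
    x_a, x_b = xs[n // 2 - 1], xs[n // 2]
    if x_a != x_b:
        # A vertical line between the two median x values is enough.
        return Fraction(1), Fraction(0), (x_a + x_b) / 2
    # Points strictly left of the middle column.
    c = sum(1 for x, _ in pts if x < x_a)
    mid_ys = sorted(y for x, y in pts if x == x_a)
    k = n // 2 - c                      # middle points that must go left
    q = (mid_ys[k - 1] + mid_ys[k]) / 2  # y-coordinate of Q
    # Tilted line through Q: x + t*y = x_a + t*q with t > 0, so that middle
    # points below Q end up on the left. Each left point above Q and each
    # right point below Q limits how large t may be.
    bounds = []
    for x, y in pts:
        if x < x_a and y > q:
            bounds.append((x_a - x) / (y - q))
        elif x > x_a and y < q:
            bounds.append((x - x_a) / (q - y))
    t = min(bounds) / 2 if bounds else Fraction(1)
    return Fraction(1), t, x_a + t * q
```

For the 16-point example (3 points left, 6 in the middle, 7 right) the returned line leaves 8 points on each side, with the 5 lowest middle points joining the 3 left points.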

rudi-moore 2010-06-24 16:13:33

@Rudi-moore: Can you please explain what you are trying to do, rather than just give the steps? It is a bit hard to read. Also, a few questions: what happens if a_g = b_h? And is it possible that the line QR divides points to the right of the points in step 3 (i.e. x-coordinate > x_a), in which case you might not get an n/2-n/2 split?

Moron 2010-06-24 17:51:05

@Rudi-moore: Rotating the line around Q might not work. I believe (though not completely sure) we can give you a configuration of points such that no line through Q will make an even split. You seem to be completely ignoring the points to the right of the vertical line. Also, the case when a_h = b_h and c=0 is undiscussed.

Moron 2010-06-24 21:13:53

@M Sorry, I'm not a native speaker; maybe that makes it a bit harder to understand. I edited the post and hope the example makes it clearer. To your questions: if a_g = b_h you can choose P_g or P_h. It doesn't matter; both will generate the same line. It is not possible that QR divides points on the right side; if it did, another R would have been chosen. I'm not completely ignoring points on the right side, only in the first steps. c=0 is only a problem if more than n/2 points are in the middle, and in that case n/2 points from the middle go to the left side.

rudi-moore 2010-06-25 08:14:01

@Rudi-moore: Good job! This looks right to me. It was simpler than I imagined it would be. +1.

Moron 2010-06-25 12:58:31

@mousey, @BlueRaja: I suggest you take a look at this answer. I believe this works! I think this deserves to be the accepted answer.

Moron 2010-06-25 12:59:25

+1 A:

To add to M's answer: a method to generate a Q (that's not so far away) in O(n log n).

To begin with, let Q be any point on the y-axis ie. Q = (0,b) - some good choices might be (0,0) or (0, (y_max-y_min)/2).

Now check if there are two points (x₁, y₁), (x₂, y₂) collinear with Q. The line between any point and Q is y = mx + b; since b is constant, this means two points are collinear with Q if their slopes m are equal. So determine the slopes m_i for all points and check if there are any duplicates (amortized O(n) using a hash table):

If all the m's are distinct, we're done; we found Q, and M's algorithm above generates the line in O(n) steps.
If two points are collinear with Q, we'll move Q up just a tiny amount ε, Q_new = (0, b + ε), and show that Q_new will not be collinear with two other points.

The criterion for ε, derived below, is:

ε < m_minΔ*x_min

To begin with, our m's look like this:

m_i = y_i/x_i - b/x_i

Let's find the minimum difference between any two distinct m_i and call it m_minΔ (O(n log n) by, for instance, sorting and then comparing the differences between m_i and m_i+1 for all i).

If we fudge b up by ε, the new equation for m becomes:

m_i,new = y_i/x_i - b/x_i - ε/x_i
       = m_i,old - ε/x_i

Since ε > 0 and x_i > 0, all m's are reduced, and all are reduced by a maximum of ε/x_min. Thus, if

ε/x_min < m_minΔ, ie.
ε < m_minΔ*x_min

is true, then two m_i which were previously unequal will be guaranteed to remain unequal.

All that's left is to show that if m_1,old = m_2,old, then m_1,new =/= m_2,new. Since both m_i were reduced by an amount ε/x_i, this is equivalent to showing x₁ =/= x₂. If they were equal, then:

y₁ = m_1,old*x₁ + b = m_2,old*x₂ + b = y₂

Contradicting our assumption that all points are distinct. Thus, m_{1, new} =/= m_{2, new}, and no two points are collinear with Q.
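A small sketch of this nudging step, assuming distinct points with x > 0: the function name `nudge_q` and the choice of half the allowed bound for ε are illustrative, and exact fractions are used so that equality of slopes can be tested reliably.

```python
from fractions import Fraction

def nudge_q(points, b=0):
    """Return a height b' >= b such that no two points are collinear with (0, b')."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    b = Fraction(b)
    # Slopes m_i of the lines from Q = (0, b) to each point, sorted.
    slopes = sorted((y - b) / x for x, y in pts)
    if len(set(slopes)) == len(slopes):
        return b                      # Q already works
    # m_minΔ: smallest gap between two distinct neighbouring slopes.
    gaps = [hi - lo for lo, hi in zip(slopes, slopes[1:]) if hi != lo]
    x_min = min(x for x, _ in pts)
    # If every slope is equal there is no constraint and any eps > 0 works;
    # otherwise take half of the bound eps < m_minΔ * x_min.
    eps = min(gaps) * x_min / 2 if gaps else Fraction(1)
    return b + eps
```

For example, with the points (1,1), (2,2), (3,5) and b = 0 the first two slopes coincide; the function moves Q up to (0, 1/3), after which all three slopes differ.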

BlueRaja - Danny Pflughoeft 2010-06-24 21:12:53

We can assume x1 =/= x2 as then the line will be parallel to the y-axis.

Moron 2010-06-24 21:34:49

@M: Yep; that's essentially the same thing (it wouldn't necessarily be true if the points weren't distinct)

BlueRaja - Danny Pflughoeft 2010-06-24 22:36:14
