I am writing a program in C. I want to find a solution by minimizing the expression

D1 + D2 + ... + Dn

where each Di is a distance between two points, computed with the distance formula. The whole expression is a function of the variables x and y.

Now I will differentiate this expression, set the derivative to zero, and solve. My doubt is:

Since every Di appears under a square root, the resulting equations will be difficult to solve. So instead we could minimize this expression:

D1^2 + D2^2 + ... + Dn^2

Will the answer produced by minimizing this expression be the same as the one produced by minimizing the original?

I have checked simple test cases such as n=2, and it produces the correct answer. Is this true in general?

If not, how can this problem be solved?

+2  A: 

You have not tested enough. Minimizing D1 + D2 is not the same as minimizing D1^2 + D2^2 in general, although it may be for some particular D1 and D2.

EDIT after you reminded me that you are only interested in distances in the plane:

In the case where D1 and D2 are distances in the geometric plane, the point that minimizes D1^2 + D2^2 (the midpoint) also minimizes D1 + D2, since any point on the segment between the two points minimizes D1 + D2. But it breaks down with three points.

Try it with the three points (0,0), (1,0) and (10,0): minimizing |x| + |x-1| + |x-10| is not the same as minimizing x^2 + (x-1)^2 + (x-10)^2. The first is minimized at the median of the coordinates, x = 1; the second at their mean, x = 11/3.

Pascal Cuoq
Yes, I understand that. But since these are distances (all positive), I thought it might be the case. For n=2 I have tested 15-20 cases, and it gives the correct result. What is the correct way to solve this problem?
nowonder
But Ewan is saying in his answer that it is correct. Please clarify.
nowonder
I think that in his answer Ewan did not take into account that D1, ..., Dn do not vary independently, but are all functions of the same x and y.
Pascal Cuoq
How is minimizing D1+D2 a line?
nowonder
@nowonder Sorry, I was thinking of something else for a minute. Please forget that part of my answer :)
Pascal Cuoq
+6  A: 

Even for 2D distances, it is not in general true that the minimum of a^2 + b^2 is at the same location as the minimum of a + b. It may be true for some very specific, limited set of problems, of course. If you're trying to find a counterexample, note that the squares over-penalize long distances; if you construct an example whose minimum contains at least one long distance, you have a good chance that the sum of squares has a different minimum.

What is the problem you are trying to solve? It's quite possible that for your problem the distinction doesn't matter, of course; or that the minimum of the sum of squares is a cheaper problem and an easier first approximation to a final solution.

It may be obvious, but if the various distances are unrelated (each varies independently), then each square is minimal exactly where the corresponding distance is, and thus the sum of unrelated distances is minimal where the sum of the squares is.

Edit after the update: you're trying to find a centroid-like point with the limitation that it lies on a particular line. In general outline: you have only one degree of freedom, so you can do plain differentiation. However, the result will be a sum of fractions with square roots in the denominators; solving that algebraically in the general case isn't possible (AFAIK). I'm not 100% positive, but I think you're in luck in that your distance sum has no local minimum except the global one; in that case Newton's method will converge reliably and fast.

So, if you can verify the assumption that there is only one local minimum, you're home free; and even if you can't, you can achieve a pretty good result fairly reliably and detect when it goes wrong simply by comparing the Newton-computed minimum against a few sanity-check points (say, the orthogonal projection of each input point onto the line). A sketch of such an iteration follows.
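
For concreteness, here is a minimal C sketch of that Newton iteration (the function names, starting point, tolerances, and iteration caps are my own illustrative choices, not from the answer). It parameterises the candidate point as (x, a*x + b) on the line y = a*x + b and applies Newton's method to the derivative of the distance sum:

    #include <math.h>
    #include <stdio.h>

    /* First and second derivatives of
       f(x) = sum_i sqrt((x-px[i])^2 + (a*x+b-py[i])^2),
       the sum of distances from (x, a*x+b) to the given points. */
    static void derivatives(const double *px, const double *py, int n,
                            double a, double b, double x,
                            double *f1, double *f2)
    {
        *f1 = 0.0; *f2 = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x - px[i];
            double dy = a * x + b - py[i];
            double d  = sqrt(dx * dx + dy * dy);
            if (d < 1e-12) continue;        /* candidate sits (almost) on a point */
            double u = dx + a * dy;         /* half the derivative of the squared distance */
            *f1 += u / d;
            *f2 += (1.0 + a * a) / d - (u * u) / (d * d * d);
        }
    }

    /* Newton's method on f'(x); returns the x of the minimizing point (x, a*x+b). */
    double minimize_on_line(const double *px, const double *py, int n,
                            double a, double b)
    {
        double x = 0.0;                       /* start near the mean of the x's */
        for (int i = 0; i < n; i++) x += px[i];
        x /= n;
        for (int iter = 0; iter < 100; iter++) {
            double f1, f2;
            derivatives(px, py, n, a, b, x, &f1, &f2);
            if (f2 <= 0.0) break;             /* degenerate (e.g. all points on the line) */
            double step = f1 / f2;
            x -= step;
            if (fabs(step) < 1e-10) break;    /* converged */
        }
        return x;
    }

    int main(void)
    {
        double px[] = { 0.0, 2.0, 5.0 }, py[] = { 1.0, 3.0, -1.0 };
        double x = minimize_on_line(px, py, 3, 0.0, 0.0);  /* line y = 0 */
        printf("minimum near (%g, %g)\n", x, 0.0);
        return 0;
    }

Note that each distance term is convex along the line (by Cauchy-Schwarz, u^2 <= (1+a^2)*d^2, so each second-derivative term is nonnegative), which is what makes the single-minimum assumption above hold.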

Eamon Nerbonne
I did not understand the "overpenalize" thing, please elaborate. My problem is: I am given a set of points in the plane, and the equation of a line. Now I have to find a point on the line such that the sum of the distances from all the given points to that point is minimized.
nowonder
Intuitively, squares "increase" large numbers more than small numbers. So a problem whose minimum involves one large distance and several small distances is likely to expose a different minimum in the squares case, since that one large distance will dramatically impact the overall sum.
Eamon Nerbonne
So what is the correct way to solve this problem? Please give some idea.
nowonder
In your specific example, imagine three points directly on the line. If two points are at 0 and one point is at 10 (as measured along the line), then the sum of distances is minimized at 0 (where it equals 10: you're 10 away from the one point at 10), but the sum of squares is minimized at 10/3 (try it).
Eamon Nerbonne
What is AFAIK in your edit?
nowonder
Does that mean I should try solving the original equation by Newton's method?
nowonder
Thank you very much for your detailed answer. I did not know that Newton's method is so useful. I just neglected it in my "numerical analysis" course last semester. I should have done that course properly.
nowonder
AFAIK: as far as I know. Which in this case means I'm not sure ;-). Newton's method is conceptually pretty easy; Wikipedia (as so often) has a fair explanation. The formula, in any case, is trivial. (Note that Newton's method finds zero crossings, so you need to apply it to the derivative, not to the original distance sum!) If Newton's method diverges (which I'm almost positive it won't for your problem setting), (stochastic) gradient descent is even easier and more reliable, but also slower; a sketch follows.
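
For comparison, a toy gradient-descent version in C (step size, iteration cap, and tolerance are arbitrary illustrative choices; a decaying step size would be more robust):

    #include <math.h>

    /* Walk downhill on f(x) = sum of distances from (x, a*x+b) to the points,
       using only the first derivative and a small fixed step. */
    double gradient_descent(const double *px, const double *py, int n,
                            double a, double b, double x0)
    {
        double x = x0, eta = 0.01;
        for (int iter = 0; iter < 100000; iter++) {
            double f1 = 0.0;                       /* f'(x) */
            for (int i = 0; i < n; i++) {
                double dx = x - px[i];
                double dy = a * x + b - py[i];
                double d  = sqrt(dx * dx + dy * dy);
                if (d > 1e-12) f1 += (dx + a * dy) / d;
            }
            if (fabs(f1) < 1e-9) break;            /* near-stationary: done */
            x -= eta * f1;
        }
        return x;
    }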
Eamon Nerbonne
Yes, at the Wikipedia entry (http://en.wikipedia.org/wiki/Newton%27s_method#Practical_considerations) it's written that if taking the derivative of a function is not easy, then we can use the secant method. Can I use it?
nowonder
Well, the derivative shouldn't be that bad - give it a shot; it'll end up looking like a sum of terms like `(x-x0)/sqrt((x-x0)^2 + y0^2)` when the line is the x-axis (which a transformation can ensure), or only slightly more complex if `y = mx + c`, i.e. y is a linear function of x rather than just 0 (and then no transformation is needed). I'm sure the secant method will work too - but realize that you need the zero crossing *of the derivative*, so even for the secant method you won't escape taking at least *one* derivative.
Eamon Nerbonne
Thank you very much for all your effort.
nowonder
+2  A: 

You are a bit unclear about the origin of your D1, D2, ..., Dn, but I assume you have a set of points P1, P2, ..., Pn in the x-y plane, and you want to find the point P0 = (x0, y0) that minimises the sum of the distances between each of P1, ..., Pn and P0.

So your D1, ..., Dn are actually:

D1 = sqrt((x0-x1)^2 + (y0-y1)^2)
D2 = sqrt((x0-x2)^2 + (y0-y2)^2)
..
Dn = sqrt((x0-xn)^2 + (y0-yn)^2)

where x1, ..., xn and y1, ..., yn are known and x0, y0 are unknown. And you want to minimise D0:

D0 = D1 + D2 + ... + Dn

If this is correct, then you want to find the geometric median. The Wikipedia article should help you produce a solution; a sketch of its iterative method follows.
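
For reference, the iterative method described on that Wikipedia page is Weiszfeld's algorithm, which repeatedly re-averages the points weighted by inverse distance. A minimal C sketch (the function name, starting point, and stopping tolerances are my own illustrative choices):

    #include <math.h>

    /* Weiszfeld's algorithm for the geometric median of (px[i], py[i]). */
    void geometric_median(const double *px, const double *py, int n,
                          double *outx, double *outy)
    {
        double x = 0.0, y = 0.0;               /* start from the centroid */
        for (int i = 0; i < n; i++) { x += px[i]; y += py[i]; }
        x /= n; y /= n;

        for (int iter = 0; iter < 1000; iter++) {
            double wx = 0.0, wy = 0.0, w = 0.0;
            for (int i = 0; i < n; i++) {
                double d = hypot(x - px[i], y - py[i]);
                if (d < 1e-12) continue;       /* estimate sits on a data point */
                wx += px[i] / d;               /* weight each point by 1/distance */
                wy += py[i] / d;
                w  += 1.0 / d;
            }
            if (w == 0.0) break;
            double nx = wx / w, ny = wy / w;
            double moved = hypot(nx - x, ny - y);
            x = nx; y = ny;
            if (moved < 1e-10) break;          /* converged */
        }
        *outx = x; *outy = y;
    }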

Update: you state in a comment that point P0 should lie on a given line (please add this to your original problem statement). This means you can rewrite y0 as a function of x0:

y0 = a*x0 + b

where a and b are given. This reduces the complexity of your distance functions and makes differentiation a serious possibility.

D1 = sqrt((x0-x1)^2 + (ax0+b-y1)^2)
D2 = sqrt((x0-x2)^2 + (ax0+b-y2)^2)
..
Dn = sqrt((x0-xn)^2 + (ax0+b-yn)^2)
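
For completeness (this step is not in the original answer; it is just the chain rule applied to the formulas above), the derivative whose zero we need is:

dD0/dx0 = ((x0-x1) + a*(a*x0+b-y1))/D1 + ... + ((x0-xn) + a*(a*x0+b-yn))/Dn

Setting this to zero has no closed algebraic solution in general, hence the iterative or brute-force approaches.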

But if the number of points n is not too large, I would just do a brute-force search along the stretch of the line where x is close to the mean of x1, ..., xn to find the point (x0, y0) that minimises D0. A sketch of that search follows.
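
One possible shape for that brute-force search in C (the search radius and step size are arbitrary placeholders you would tune to your data):

    #include <math.h>

    /* Sample the line y = a*x + b on a grid around the mean of the x's and
       keep the sample with the smallest sum of distances to the points. */
    double brute_force_on_line(const double *px, const double *py, int n,
                               double a, double b)
    {
        double mean = 0.0;
        for (int i = 0; i < n; i++) mean += px[i];
        mean /= n;

        double bestx = mean, bestd = INFINITY;
        double radius = 100.0, step = 0.001;   /* tune to your data */
        for (double x = mean - radius; x <= mean + radius; x += step) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += hypot(x - px[i], a * x + b - py[i]);
            if (sum < bestd) { bestd = sum; bestx = x; }
        }
        return bestx;    /* the minimizing point is (bestx, a*bestx + b) */
    }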

jilles de wit
Interesting. Given the specific limitations the OP imposes and his (sane) aversion to mathematically complex solutions, I suspect the simpler Newton's-method or more robust gradient-descent solutions are easier to demonstrably get correct and more than fast enough - notably, Wikipedia claims there is no exact algebraic solution anyhow; you need iterative refinement.
Eamon Nerbonne
That is what I meant. The iterative solution given on the Wikipedia page is as easy as it gets. By the very nature of this problem there is only one minimum and there are no local minima, so convergence is guaranteed. Note, by the way, that the centroid != the geometric median.
jilles de wit
Thanks for giving the exact term "geometric median". I was searching for what it is called.
nowonder
A: 

Counterexample:

d1=1 & d2=10 (sum=11 & sumOfSquares=101)

d1=6 & d2=6 (sum=12 & sumOfSquares=72)

Sum increased but sum of squares decreased.

Emilio M Bumachar
A: 

Your problem is the minimization of an objective function given by some norm of your distances. The distances are Euclidean, and as such each represents the Euclidean norm of the difference between two vectors. In order to understand the difference between minimizing sum(ai) and minimizing sum(ai^2), I recommend you read the Wikipedia entry on norms; bottom line, notice the following:

||x||2       <= ||x||1 <= sqrt(n)||x||2
||x||_\infty <= ||x||2 <= sqrt(n)||x||_\infty
||x||_\infty <= ||x||1 <=       n||x||_\infty

where ||x||2 is the Euclidean norm, ||x||1 is sum(abs(x1)+abs(x2)+...+abs(xn)), and ||x||_\infty is max(abs(x1),abs(x2),...,abs(xn)). In your case all the numbers are positive (you already have the Euclidean norm), so you can see the difference; a small numeric illustration follows.
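
As a small sanity check (my own illustration, not part of the answer), here is a C snippet computing the three norms of one vector and confirming the orderings above:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double v[] = { 1.0, -2.0, 3.0 };
        int n = sizeof v / sizeof v[0];
        double n1 = 0.0, n2 = 0.0, ninf = 0.0;
        for (int i = 0; i < n; i++) {
            n1 += fabs(v[i]);                         /* ||x||1   */
            n2 += v[i] * v[i];                        /* ||x||2^2 */
            if (fabs(v[i]) > ninf) ninf = fabs(v[i]); /* ||x||inf */
        }
        n2 = sqrt(n2);
        /* prints 6, ~3.74166, 3; indeed ninf <= n2 <= n1 <= sqrt(n)*n2 */
        printf("||x||1 = %g, ||x||2 = %g, ||x||inf = %g\n", n1, n2, ninf);
        return 0;
    }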

It may also be helpful (though much more difficult to fully grasp) to read the fantastic book by Golub and Van Loan, Matrix Computations.

Arrieta