What's the algorithm for computing a least squares plane in (x, y, z) space, given a set of 3D data points? In other words, if I had a bunch of points like (1, 2, 3), (4, 5, 6), (7, 8, 9), etc., how would one go about calculating the best fit plane f(x, y) = ax + by + c? What's the algorithm for getting a, b, and c out of a set of 3D points?
All you'll have to do is to solve the system of equations.
If those are your points: (1, 2, 3), (4, 5, 6), (7, 8, 9)
That gives you the equations:
3=a*1 + b*2 + c
6=a*4 + b*5 + c
9=a*7 + b*8 + c
So your question actually should be: How do I solve a system of equations?
Therefore I recommend reading this SO question.
If I've misunderstood your question let us know.
EDIT:
Ignore my answer as you probably meant something else.
If you have n data points (x[i], y[i], z[i]), compute the 3x3 symmetric matrix A whose entries are:
sum_i x[i]*x[i], sum_i x[i]*y[i], sum_i x[i]
sum_i x[i]*y[i], sum_i y[i]*y[i], sum_i y[i]
sum_i x[i], sum_i y[i], n
Also compute the 3 element vector b:
{sum_i x[i]*z[i], sum_i y[i]*z[i], sum_i z[i]}
Then solve Ax = b for the given A and b. The three components of the solution vector are the coefficients to the least-square fit plane {a,b,c}.
Note that this is the "ordinary least squares" fit, which is appropriate only when z is expected to be a linear function of x and y. If you are looking more generally for a "best fit plane" in 3-space, you may want to learn about "geometric" least squares.
Note also that this will fail if your points are in a line, as your example points are.
It sounds like all you want to do is linear regression with 2 regressors. The wikipedia page on the subject should tell you all you need to know and then some.
As with any least-squares approach, you proceed like this:
Before you start coding
Write down an equation for a plane in some parameterization, say
0 = ax + by + z + d
in thee parameters(a, b, d)
.Find an expression
D(\vec{v};a, b, d)
for the distance from an arbitrary point\vec{v}
.Write down the sum
S = \sigma_i=0,n D^2(\vec{x}_i)
, and simplify until it is expressed in terms of simple sums of the components ofv
like\sigma v_x
,\sigma v_y^2
,\sigma v_x*v_z
...Write down the per parameter minimization expressions
dS/dx_0 = 0
,dS/dy_0 = 0
... which gives you a set of three equations in three parameters and the sums from the previous step.Solve this set of equations for the parameters.
(or for simple cases, just look up the form). Using a symbolic algebra package (like Mathematica) could make you life much easier.
The coding
- Write code to form the needed sums and find the parameters from the last set above.
Alternatives
Note that if you actually had only three points, you'd be better just finding the plane that goes through them.
Also, if the analytic solution in unfeasible (not the case for a plane, but possible in general) you can do steps 1 and 2, and use a Monte Carlo minimizer on the sum in step 3.