Hello, I'm sorry if I'm not using the correct mathematical terms, but I hope you'll understand what I'm trying to accomplish.

My problem: I'm using linear regression (currently the least squares method) on the values from two vectors x and y against the result z. This is to be done in MATLAB, and I'm using the \-operator to perform the regression. My dataset will contain a few thousand observations (up to about 50000 at most).

The x-values will be in the range 10-300 (most between 60 and 100) and the y-values in the range 1-3.

My code looks like this:

X = [ones(size(x,1),1) x y];
parameters = X\z;

The output "parameters" are then the three factors a0, a1 and a2 which is used in this formula:

a0 + a1*x_i + a2*y_i = z_i

(where the subscript i indexes the observations)

This works as expected, but I want the two parameters a1 and a2 to ALWAYS be positive values, even when the vector z is negative (which of course means a0 will be negative), since this is what the real model looks like (z is always positively correlated with x and y). Is this possible using the least squares method? I'm also open to other algorithms for linear regression.

+1  A: 

Let me try to rephrase to clarify. According to your model, z is always positively correlated with x and y. However, sometimes when you solve the linear regression for the coefficients, one of them comes out negative.

If you are right about the data, this should only happen when the correct coefficient is small and noise happens to push it negative. You could just set it to zero, but then the means wouldn't match properly.

In that case the correct solution is as jpalacek says, but explained in more detail here:

  1. Regress z against x and y. If both a1 and a2 come out positive, take the result.
  2. If a1 is negative, assume it should be zero. Regress z against y alone. If a2 is positive, take a1 as 0, and take a0 and a2 from this regression.
  3. If that a2 is negative too, assume it should also be zero. Regress z against 1 alone and take the result as a0; let a1 and a2 be 0.

This should give you what you want.
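A minimal MATLAB sketch of these three steps, assuming x, y and z are column vectors (the variable names are only illustrative):

X = [ones(size(x)) x y];
a = X \ z;                         % step 1: full regression
if a(2) < 0                        % a1 came out negative: drop x
    b = [ones(size(y)) y] \ z;     % step 2: regress z on y only
    if b(2) >= 0
        a = [b(1); 0; b(2)];       % a1 = 0, take a0 and a2 from this fit
    else                           % step 3: intercept only
        a = [mean(z); 0; 0];
    end
end
% a(1) = a0, a(2) = a1, a(3) = a2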

Nick Fortescue
+1  A: 

The simple solution is to use a tool designed to solve it. That is, use lsqlin from the Optimization Toolbox, and set a lower-bound constraint on two of the three parameters.

Thus, assuming x, y, and z are all COLUMN vectors,

A = [ones(length(x),1), x, y];
lb = [-inf, 0, 0];
a = lsqlin(A, z, [], [], [], [], lb);

This will constrain only the second and third unknown parameters.
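As a quick sanity check, here is a small synthetic example (the data and coefficient values are purely illustrative; it assumes the Optimization Toolbox is available for lsqlin):

n = 1000;
x = 60 + 40*rand(n,1);                 % x roughly in the 60-100 range
y = 1 + 2*rand(n,1);                   % y in the 1-3 range
z = -5 + 0.2*x + 1.5*y + randn(n,1);   % true a0 negative, a1 and a2 positive

A = [ones(n,1), x, y];
lb = [-inf, 0, 0];
a = lsqlin(A, z, [], [], [], [], lb)   % a(2) and a(3) come back nonnegative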

Without the Optimization Toolbox, use lsqnonneg, which is part of MATLAB itself. Here too the solution is easy enough.

A = [ones(length(x),1), x, y];
a = lsqnonneg(A, z);

Your model will be

z = a(1) + a(2)*x + a(3)*y

If a(1) is essentially zero, i.e., within a tolerance of zero, then assume the first parameter was pinned at the zero bound. In that case, solve a second problem by changing the sign on the column of ones in A.

A(:,1) = -1;
a = lsqnonneg(A, z);

If this solution has a(1) significantly non-zero, then the second solution must be better than the first. Your model will now be

z = -a(1) + a(2)*x + a(3)*y

It costs you at most two calls to lsqnonneg, and the second call is only needed some fraction of the time (lacking any information about your problem, the odds are about 50%).
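For reference, a minimal sketch wrapping the two lsqnonneg calls described above into one function; the function name, the tolerance, and the choice to return a signed intercept are my own, not part of the answer itself. It assumes x, y and z are column vectors.

function a = fit_positive_slopes(x, y, z)
% Fit z ~ a(1) + a(2)*x + a(3)*y with a(2), a(3) >= 0, allowing a(1) < 0.
A = [ones(length(x),1), x, y];
a = lsqnonneg(A, z);
tol = 1e-10;                   % tolerance for "essentially zero"
if a(1) < tol
    % The intercept was pinned at the zero bound: flip the sign of the
    % constant column and refit, so the intercept can effectively go negative.
    A(:,1) = -1;
    a2 = lsqnonneg(A, z);
    if a2(1) > tol
        a = a2;
        a(1) = -a(1);          % report the signed intercept a0
    end
end
end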

woodchips