I have a complicated combined model for which I can define a likelihood in a function, and I need to optimize the parameters. The problem is that the parameters go off in all directions if they are not restricted. Hence, I need to implement a restriction on the parameters, and the one proposed by the professor is that the sum of the squared parameter values should equal 1.

I've been playing around with both the optim() and nlm() functions, but I can't really get what I want. My first idea was to use n-1 parameters and calculate the last one from the rest, but this doesn't work (as expected).

To illustrate, some toy data and function reflecting the core problem of what I want to achieve:

# toy data: three predictors and a response whose coefficients
# (0.57, -0.57, 0.57) have a sum of squares of (almost) 1
dd <- data.frame(
    X1 = rnorm(100),
    X2 = rnorm(100),
    X3 = rnorm(100)
)
dd <- within(dd, Y <- 2 + 0.57*X1 - 0.57*X2 + 0.57*X3 + rnorm(100, 0, 0.2))

myfunc2 <- function(alpha, dd){
    # first idea: derive the last parameter from the others
    # so that sum(alpha^2) == 1
    alpha <- c(alpha, sqrt(1 - sum(alpha^2)))
    X <- as.matrix(dd[,-4]) %*% alpha   # linear combination of X1..X3
    m.mat <- model.matrix(~X)
    mod <- glm.fit(m.mat, dd$Y)
    Sq <- sum(resid(mod)^2)             # residual sum of squares
    return(Sq)
}

b <- c(1,0)
optim(b,myfunc2,dd=dd)

This obviously results in:

Error: (subscript) logical subscript too long
In addition: Warning message:
In sqrt(1 - sum(alpha^2)) : NaNs produced

Does anybody have an idea how to implement restrictions on parameters in optimization processes?

PS: I am aware that this example code doesn't make sense at all. It's just for demonstration purposes.


Edit: Solved it! See Marek's answer.

+1  A: 

You need to provide more details about your constraint. If you are dealing with a sum of squares equal to one, an elegant way to solve it using optim is to let the parameters entering optim be unconstrained, and reparametrize them within your optimization function.

To illustrate my point with the example you stated above, you can get the optimization running by making the following changes to your code:

myfunc2 <- function(alpha,dd){
    alpha <- alpha^2/sum(alpha^2);
    X <- as.matrix(dd[,-4]) %*% alpha
    m.mat <- model.matrix(~X)
    mod <- glm.fit(m.mat,dd$Y)
    Sq <- sum(resid(mod)^2)
    return(Sq)
}

b = c(1,1,1)
optim(b,myfunc2,dd=dd);
ans = b^2/sum(b^2)

This would work for more than 3 variables as well. Let me know if this makes sense and if you have additional questions.

Ramnath
@Ramnath: Turned out that my first objections (based on theoretical considerations) were quite off. Marek corrected your code, but you did put me on the right track. +1 for that, a thank you, and an apology for my initial comment. PS: I had to "edit" your answer to be able to cast a vote. I added one space, so nothing changed.
Joris Meys
I assumed that your parameters were all positive. Marek's modification allows both signs, which I think works better for your case.
Ramnath
@Ramnath It has nothing to do with signs. Your normalization is wrong: the `alpha^2` don't sum to 1. Check for example `alpha=c(0.5,0.5)`: after your transformation you get `c(0.5,0.5)`, whose sum of squares is `0.5`.
Marek
@Marek Ah, I see. I added a square to alpha. My bad. Thanks for picking that up.
Ramnath
A: 

It may be a bit trickier than you want, and I don't have the time to work out the details at the moment, but I think you can still do this. Suppose you bound all parameters between 0 and 1 (you can do this with L-BFGS-B) and map the optim() parameters p to your real parameters p' as follows:

p_1' = p_1
p_2' = sqrt(p_2 * (1 - p_1'^2))
p_3' = sqrt(p_3 * (1 - (p_1'^2 + p_2'^2)))
...
p_n' = sqrt(1 - (p_1'^2 + ... + p_{n-1}'^2))

or something a bit like that.
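In R, that mapping might look roughly like the following (a rough sketch; the helper name sos_map is made up here — it takes n-1 parameters in [0,1] and returns n non-negative values whose squares sum to 1):

sos_map <- function(p) {
    n <- length(p) + 1
    pp <- numeric(n)
    pp[1] <- p[1]
    for (i in seq_along(p)[-1]) {
        # p[i] picks a fraction of the squared mass still unassigned
        pp[i] <- sqrt(p[i] * (1 - sum(pp[1:(i-1)]^2)))
    }
    pp[n] <- sqrt(1 - sum(pp^2))  # the last value absorbs the rest
    pp
}
sum(sos_map(c(0.3, 0.5))^2)  # 1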

Ben Bolker
@Ben I've been fiddling with those things too, but they don't scale well to larger problems.
Joris Meys
+1  A: 

I think that Ramnath's answer isn't bad, but he made a mistake: the alpha correction should be modified.

This is the improved version:

myfunc2 <- function(alpha,dd){
    alpha <- alpha/sqrt(sum(alpha^2)) # here the modification: rescale to unit length ;)
    X <- as.matrix(dd[,-4]) %*% alpha
    m.mat <- model.matrix(~X)
    mod <- glm.fit(m.mat,dd$Y)
    Sq <- sum(resid(mod)^2)
    return(Sq)
}

b = c(1,1,1)
( x <- optim(b, myfunc2, dd=dd)$par )
( final_par <- x/sqrt(sum(x^2)) )

I got similar results to your unrestricted version.


[EDIT]

Actually this won't work correctly if the start point is wrong, e.g.:

x <- optim(-c(1,1,1), myfunc2, dd=dd)$par
( final_par <- x/sqrt(sum(x^2)) )
# [1] -0.5925  0.5620 -0.5771

It gives the negative of the true estimate, because mod <- glm.fit(m.mat,dd$Y) estimates a negative coefficient for X.

I think this glm re-estimation isn't quite correct. I think you should estimate the intercept as the mean of the residuals Y - X*alpha.

Something like:

f_err_1 <- function(alpha, dd) {
    alpha <- alpha/sqrt(sum(alpha^2))   # rescale alpha to unit length
    X <- as.matrix(dd[,-4]) %*% alpha
    a0 <- mean(dd$Y - X)                # intercept as mean residual
    Sq <- sum((dd$Y - a0 - X)^2)
    return(Sq)
}

x <- optim(c(1,1,1), f_err_1, dd=dd)$par;( final_par <- x/sqrt(sum(x^2)) )
# [1] 0.5924 -0.5620  0.5772
x <- optim(-c(1,1,1), f_err_1, dd=dd)$par;( final_par <- x/sqrt(sum(x^2)) )
# [1]  0.5924 -0.5621  0.5772
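For comparison, the true direction these estimates should recover is the vector of generating coefficients from the toy data, rescaled to unit length:

c(0.57, -0.57, 0.57) / sqrt(sum(c(0.57, -0.57, 0.57)^2))
# [1]  0.5774 -0.5774  0.5774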
Marek
That's a coincidence. I just came to the same result at the same time. Thx for the confirmation
Joris Meys
I'm standing behind you. Booo! ;)
Marek
I guess I'm a little surprised this works, since it is fitting n parameters with only n-1 degrees of freedom. Maybe that's only a difficulty for statistics, not for optimization per se. @Joris, just out of curiosity, what about my attempted solution below doesn't scale well?
Ben Bolker
@Ben: the number of parameters varies, so I'd have to find a way of recursively working my way through the parameters to reparametrize. In this simple case that's doable, but with up to 40 parameters and a far more complex optimization function, the overhead just gets too big. R and recursion are afaik not the best of friends... Btw, I'm as surprised as you that this works. Initially I had the same reflex on the df, but I believe the df only start to matter when doing the inference, not the optimization.
Joris Meys
Don't think it would be that hard ... it's a special case/variant of the 'additive log-ratio' transformation <http://rss.acs.unt.edu/Rdoc/library/compositions/html/alr.html>, but you can do it without the log part (if you want to allow parameters on the boundary / can use an optimizer that allows 0<x<1 box constraints on the individual parameters). That constrains the sum to be 1 rather than the sum of squares, and constrains 0<x<1 rather than -1<x<1 (which you would need for sum of squares == 1) -- don't know if allowing negative params is important to you. Not that important since you have a working solution? (A sketch of that transformation follows below.)
Ben Bolker
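
A minimal sketch of the transformation Ben mentions (assuming the softmax-style inverse of the additive log-ratio; the helper name alr_inv is illustrative):

# inverse additive log-ratio: n-1 unconstrained values -> n values
# in (0,1) that sum to 1
alr_inv <- function(y) {
    e <- exp(c(y, 0))  # append the reference coordinate
    e / sum(e)
}
x <- alr_inv(c(0.2, -1.3))
sum(x)          # 1
sum(sqrt(x)^2)  # also 1: sqrt(x) meets the sum-of-squares constraint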