With the linear model function lm(), polynomial formulas can use a shortcut notation like this:

m <- lm(y ~ poly(x,3))

This is a shortcut that saves the user from having to create x^2 and x^3 variables or type them into the formula as I(x^2) + I(x^3). Is there a comparable notation for the nonlinear function nls()?

+3  A: 

Short answer: yes.

Slightly longer answer: It is pretty cheap to test this. I just ran example(nls) to get a model and data loaded and then inserted a term with poly().

Even longer answer: lm() doesn't actually know about poly(); the formulae get resolved before the fitting happens. So in the sense that nls() has a formula interface ... it was bound to accept poly().

Off-topic and not asked for: Did you look into splines as well as per Harrell's RMS book?
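A minimal sketch of what such a test might look like, on simulated data rather than the example(nls) data (the variable names, the simulated cubic and the starting values are all assumptions made for illustration). Unlike lm(), nls() does not expand a matrix term into one coefficient per column, so each column of poly(x, 3) gets its own explicitly named parameter:

```r
set.seed(1)

# Simulated data: a noisy cubic (illustrative only)
x <- seq(0, 1, length.out = 50)
y <- 1 + 2 * x - 3 * x^2 + 0.5 * x^3 + rnorm(50, sd = 0.05)

# lm() expands poly(x, 3) into three columns automatically
fit_lm <- lm(y ~ poly(x, 3))

# nls() needs every parameter spelled out, so index the columns of poly()
fit_nls <- nls(
  y ~ b0 + b1 * poly(x, 3)[, 1] + b2 * poly(x, 3)[, 2] + b3 * poly(x, 3)[, 3],
  start = list(b0 = 1, b1 = 1, b2 = 1, b3 = 1)
)

# Same model, so the estimates should agree up to numerical tolerance
cbind(lm = coef(fit_lm), nls = coef(fit_nls))
```

Because this particular model is linear in its parameters, lm() is the natural tool; the comparison is only there to check that the poly() terms are being picked up by nls() as intended.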

Dirk Eddelbuettel
I had tested poly() and was having trouble with it. After examining the examples in example(nls) I see I had a syntax issue related to my coefficients. I thought it was barfing on `poly` when it was really barfing on my syntactical issues. WRT splines and RMS, I really like splines quite a lot. And I have muddled through much of RMS. In this situation, however, I am trying to recreate a hot mess someone put together using some regression add-in for Excel.
JD Long
Ah, so you're filming Mission Impossible III ? ;-)
Dirk Eddelbuettel
I think the working title is Dumb and Dumber
JD Long
+4  A: 

poly(x, 3) is rather more than just a shortcut for x + I(x ^ 2) + I(x ^ 3): it actually produces orthogonal polynomials (orthogonal with respect to the observed x values, rather than raw powers of x), which have the nice property of being uncorrelated:

options(digits = 2)
x <- runif(100)
var(cbind(x, x ^ 2, x ^ 3))
#       x            
# x 0.074 0.073 0.064
#   0.073 0.077 0.071
#   0.064 0.071 0.067
zapsmall(var(poly(x, 3)))
#      1    2    3
# 1 0.01 0.00 0.00
# 2 0.00 0.01 0.00
# 3 0.00 0.00 0.01
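As a rough illustration of what this means for a linear fit, the sketch below (simulated data; all variable names are assumptions made for the example) compares the default orthogonal basis with poly(x, 3, raw = TRUE), which gives the plain powers of x. The coefficients differ between the two bases, but the fitted values should match because both bases span the same space:

```r
set.seed(42)

# Simulated data (illustrative only)
x <- runif(100)
y <- 1 + 2 * x - x^3 + rnorm(100, sd = 0.1)

fit_orth  <- lm(y ~ poly(x, 3))              # orthogonal basis (default)
fit_raw   <- lm(y ~ poly(x, 3, raw = TRUE))  # raw powers x, x^2, x^3
fit_power <- lm(y ~ x + I(x^2) + I(x^3))     # the long-hand formula

# The columns of poly(x, 3) are orthonormal: crossprod is the identity
orth_check <- all.equal(unname(crossprod(poly(x, 3))), diag(3))

# Fitted values agree across bases even though the coefficients differ
fitted_check <- all.equal(fitted(fit_orth), fitted(fit_raw))

# raw = TRUE reproduces the long-hand coefficients
coef_check <- all.equal(unname(coef(fit_raw)), unname(coef(fit_power)))

c(orth_check, fitted_check, coef_check)
```

So if coefficients on x, x^2 and x^3 themselves are needed (for example to match output from another tool), raw = TRUE is the option to reach for; the default basis is better conditioned numerically.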
hadley
+1 for the orthogonality pointer. Does the interpretation of the parameter estimates remain the same in the two cases given above? I know that the fitted values will remain the same in a linear model, as in both cases the terms span the same space. But what about when they are used in a nonlinear least squares framework?
Laplace... Taylor series... oh shit I need to get my dusty copy of Fundamental Methods of Mathematical Economics.
JD Long