Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I use these functions, but I have never written a function that takes a formula object as an argument. Could someone help, either by linking to a readable introduction to this side of R programming or by giving a self-contained example?

+2  A: 

You can use model.matrix() and model.frame() to evaluate the formula:

lm1 <- lm(log(Volume) ~ log(Girth) + log(Height), data=trees)
print(lm1)

form <- log(Volume) ~ log(Girth) + log(Height)

# use model.matrix
mm <- model.matrix(form, trees)
lm2 <- lm.fit(as.matrix(mm), log(trees[,"Volume"]))
print(coefficients(lm2))

# use model.frame, need to add intercept by hand
mf <- model.frame(form, trees)
lm3 <- lm.fit(as.matrix(data.frame("Intercept"=1, mf[,-1])), mf[,1])
print(coefficients(lm3))

which yields

Call:
lm(formula = log(Volume) ~ log(Girth) + log(Height), data = trees)

Coefficients:
(Intercept)   log(Girth)  log(Height)
      -6.63         1.98         1.12

(Intercept)  log(Girth) log(Height)
     -6.632       1.983       1.117
  Intercept  log.Girth. log.Height.
     -6.632       1.983       1.117
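
Since the question asks for a self-contained function that takes a formula, the pieces above could be wrapped roughly as follows. This is only a minimal sketch: fit_ols is a hypothetical name, not part of any package, and it assumes a plain least-squares fit with no weights, offsets, or special NA handling.

```r
# Hypothetical helper (name is illustrative): least-squares fit from a formula.
fit_ols <- function(formula, data) {
  stopifnot(inherits(formula, "formula"))
  mf <- model.frame(formula, data = data)      # evaluate the formula's variables in `data`
  y  <- model.response(mf)                     # left-hand side of the formula
  X  <- model.matrix(attr(mf, "terms"), mf)    # design matrix, intercept included
  lm.fit(X, y)$coefficients                    # named coefficient vector
}

ols_coef <- fit_ols(log(Volume) ~ log(Girth) + log(Height), trees)
print(ols_coef)
```

Building the design matrix with model.matrix() on the terms of the model frame avoids adding the intercept by hand and keeps the coefficient names the same as those reported by lm().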
Dirk Eddelbuettel
Thanks, very interesting. I also understand why glmnet or other packages may not offer this capability: glmnet uses sparse matrices from the Matrix package, which model.matrix() may not be able to handle.
gappy
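
On the sparse-matrix point, the Matrix package does provide sparse.model.matrix(), which builds a sparse design matrix from a formula. A minimal sketch, assuming the Matrix package is installed and reusing the trees example from the answer (whether a given modelling package accepts the result as input is a separate question not checked here):

```r
library(Matrix)

form <- log(Volume) ~ log(Girth) + log(Height)

# Sparse counterpart of model.matrix(); the result is a sparse Matrix object.
smm <- sparse.model.matrix(form, data = trees)
print(class(smm))

# Same entries as the dense design matrix, ignoring attributes.
dense <- model.matrix(form, trees)
print(all.equal(as.matrix(smm), dense, check.attributes = FALSE))
```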