Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I am a user of these functions. However, I have never written a function that takes a formula object as an argument. I was wondering if someone could help me, by either linking to a readable introduction to this side of R programming, or by giving a self-contained example.
A:
You can use model.matrix() and model.frame() to evaluate the formula:
lm1 <- lm(log(Volume) ~ log(Girth) + log(Height), data=trees)
print(lm1)
form <- log(Volume) ~ log(Girth) + log(Height)
# use model.matrix
mm <- model.matrix(form, trees)
lm2 <- lm.fit(as.matrix(mm), log(trees[,"Volume"]))
print(coefficients(lm2))
# use model.frame, need to add intercept by hand
mf <- model.frame(form, trees)
lm3 <- lm.fit(as.matrix(data.frame("Intercept"=1, mf[,-1])), mf[,1])
print(coefficients(lm3))
which yields
Call:
lm(formula = log(Volume) ~ log(Girth) + log(Height), data = trees)

Coefficients:
(Intercept)   log(Girth)  log(Height)
      -6.63         1.98         1.12
(Intercept) log(Girth) log(Height)
-6.632 1.983 1.117
Intercept log.Girth. log.Height.
-6.632 1.983 1.117
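Since the question asks for a function that takes a formula as an argument, the same steps can be wrapped up as a rough sketch. The name fit_formula is hypothetical and not part of any package; it only assumes the base R functions model.frame(), model.response(), model.matrix() and lm.fit().

```r
# Hypothetical helper (not from any package): least-squares fit
# driven by a formula and a data frame.
fit_formula <- function(formula, data) {
  stopifnot(inherits(formula, "formula"))
  # Evaluate the variables named in the formula within `data`
  mf <- model.frame(formula, data)
  # Left-hand side of the formula
  y <- model.response(mf)
  # Design matrix built from the terms; includes the intercept column
  X <- model.matrix(attr(mf, "terms"), mf)
  lm.fit(X, y)$coefficients
}

form <- log(Volume) ~ log(Girth) + log(Height)
print(fit_formula(form, trees))
```

Taking the response from model.response() and the design matrix from the terms of the model frame avoids adding the intercept by hand, and the result should agree with the coefficients of lm(form, data = trees).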
Dirk Eddelbuettel
2009-08-19 16:00:11
Thanks, very interesting. I also understand now why glmnet or other packages may not offer this capability: glmnet uses sparse matrices from the Matrix package, which model.matrix() may not handle.
gappy
2009-08-19 18:24:20