ansaurus

Question

large-scale regression in R with a sparse feature matrix

Answer 1

+3 A:

I don't know about SparseM, but the Matrix package has an unexported function, lm.fit.sparse, that you can call with the ::: operator. See vignette("sparseModels", package="Matrix"). Here is an example:

Create the data:

> y<-rnorm(30)
> x<-factor(sample(letters,30,replace=TRUE))
> X<-as(x,"sparseMatrix")
> class(X)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
> dim(X)
[1] 18 30

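Run the regression. X has one row per factor level and one column per observation (18 x 30 above), so transpose it to put observations in rows: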

> Matrix:::lm.fit.sparse(t(X),y)
 [1] -0.17499968 -0.89293312 -0.43585172  0.17233007 -0.11899582  0.56610302
 [7]  1.19654666 -1.66783581 -0.28511569 -0.11859264 -0.04037503  0.04826549
[13] -0.06039113 -0.46127034 -1.22106064 -0.48729092 -0.28524498  1.81681527

For comparison, the dense fit with lm gives the same coefficients:

> lm(y~x-1)

Call:
lm(formula = y ~ x - 1)

Coefficients:
      xa        xb        xd        xe        xf        xg        xh        xj  
-0.17500  -0.89293  -0.43585   0.17233  -0.11900   0.56610   1.19655  -1.66784  
      xm        xq        xr        xt        xu        xv        xw        xx  
-0.28512  -0.11859  -0.04038   0.04827  -0.06039  -0.46127  -1.22106  -0.48729  
      xy        xz  
-0.28524   1.81682
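
If calling an unexported function feels fragile, the same fit can be sketched with exported Matrix functions only, by building a sparse design matrix and solving the normal equations. This is a minimal sketch rather than what lm.fit.sparse does internally; the data frame d, the seed, and the sample size are assumptions made for illustration, and forming X'X can be numerically poor for ill-conditioned designs.

```r
library(Matrix)

# Illustrative data (assumed): one factor predictor, as in the example above
set.seed(1)
n <- 200
d <- data.frame(y = rnorm(n),
                x = factor(sample(letters, n, replace = TRUE)))

# Sparse design matrix: one row per observation, one column per factor level
X <- sparse.model.matrix(~ x - 1, data = d)

# Solve the normal equations (X'X) b = X'y using sparse linear algebra
XtX  <- crossprod(X)        # sparse symmetric matrix
Xty  <- crossprod(X, d$y)   # one-column matrix
beta <- as.vector(solve(XtX, Xty))

# Compare with the dense fit from lm
fit <- lm(y ~ x - 1, data = d)
all.equal(beta, unname(coef(fit)))
```

sparse.model.matrix already returns observations in rows, so no transpose is needed here.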

Jyotirmoy Bhattacharya 2010-07-03 05:18:36

Answer 2

+1 A:

You might also get some mileage by looking here:

The biglm package.
The High Performance and Parallel Computing R task view.
A paper about Sparse Model Matrices for Generalized Linear Models (PDF), by Martin Mächler and Douglas Bates from useR! 2010.
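
As a rough idea of what biglm offers, here is a minimal sketch of fitting a linear model in chunks, so that the full data set never has to be in memory at once. The data, the chunk size, and the variable names are assumptions made for illustration; in practice each chunk would be read from a file or database. Note that biglm works on dense chunks, so it addresses the number of rows rather than sparsity.

```r
library(biglm)

# Illustrative data (assumed), split into three chunks of 100 rows
set.seed(1)
d <- data.frame(y = rnorm(300), x1 = rnorm(300), x2 = rnorm(300))
chunks <- split(d, rep(1:3, each = 100))

# Fit on the first chunk, then fold in the remaining chunks one at a time
fit <- biglm(y ~ x1 + x2, data = chunks[[1]])
for (chunk in chunks[-1]) {
  fit <- update(fit, chunk)
}

coef(fit)
```

If the model includes factors, every chunk needs to carry the same set of levels, otherwise the design matrices will not line up across chunks.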

Steve Lianoglou 2010-07-03 20:37:35
