Let's say I have a data matrix d

pc = prcomp(d)

# pc1 and pc2 are the principal components  
pc1 = pc$rotation[,1] 
pc2 = pc$rotation[,2]

Then this should fit the linear regression model, right?

r = lm(y ~ pc1+pc2)

But then I get this error:

Error in model.frame.default(formula = y ~ pc1+pc2, drop.unused.levels = TRUE) : 
   unequal dimensions('pc1')

I guess there are packages out there that do this automatically, but shouldn't this work too?

+3  A: 

Answer: you don't want pc$rotation. It's the rotation matrix (the loadings), not the matrix of rotated values (the scores).

Make up some data:

> x1 = runif(100)
> x2 = runif(100)
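> y = rnorm(2+3*x1+4*x2)  # note: rnorm()'s first argument is n; given a vector it uses its length, so y is 100 N(0,1) draws unrelated to x1 and x2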
> d = cbind(x1,x2)

> pc = prcomp(d)
> dim(pc$rotation)
[1] 2 2

Oops. The "x" component is what we want. From ?prcomp: "x: if ‘retx’ is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the ‘rotation’ matrix) is returned."

> dim(pc$x)
[1] 100   2
> lm(y~pc$x[,1]+pc$x[,2])

Call:
lm(formula = y ~ pc$x[, 1] + pc$x[, 2])

Coefficients:
(Intercept)    pc$x[, 1]    pc$x[, 2]  
    0.04942      0.14272     -0.13557
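
A variant that may be easier to read is to put the scores in a data frame and refer to them by column name. This is only a sketch: the seed is arbitrary, and the line defining y assumes the intent was a response that actually depends on x1 and x2 plus noise, which is not what the transcript above computes.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
x1 <- runif(100)
x2 <- runif(100)
y  <- 2 + 3*x1 + 4*x2 + rnorm(100)   # assumed intent: linear in x1, x2 plus N(0,1) noise
d  <- cbind(x1, x2)

pc     <- prcomp(d)
scores <- as.data.frame(pc$x)        # one row per observation, columns PC1 and PC2

# Regress y on the scores; PC1 and PC2 are looked up in `scores`
fit <- lm(y ~ PC1 + PC2, data = scores)
coef(fit)
```

With both components kept, the fitted values are the same as those from lm(y ~ x1 + x2), since the scores are just a centred rotation of the original columns.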
Ben Bolker
hey, this seems to work, but I tried this in R:

> pc
Standard deviations:
[1] 0.3068542 0.2650774

Rotation:
          PC1        PC2
x1 -0.5518651  0.8339334
x2 -0.8339334 -0.5518651

x1[1] is the first element of x1:

> x1[1]
[1] 0.69602246
> x2[1]
[1] 0.268991455

Then the first element of pc$x[,1] must be the first element of the first principal component, right? So shouldn't pc$x[1,1] = pc$rotation[1,1]*x1[1]+pc$rotation[1,2]*x2[1]?

> pc$rotation[1,1]*x1[1]+pc$rotation[1,2]*x2[1]
[1] -0.1597895

but actually pc$x[1,1] is:

> pc$x[1,1]
[1] 0.08993233
phpdash
This looks terrible; isn't there a way to separate the lines?
phpdash
No - read up on matrix multiplication.
hadley
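
To spell out the matrix multiplication point, two things appear to be off in the hand calculation in the comment above. First, prcomp centres each column by default, so the column means have to be subtracted before rotating. Second, the weights for the first component are the first column of the rotation matrix, i.e. pc$rotation[1,1] and pc$rotation[2,1], not pc$rotation[1,2]. A minimal sketch with made-up data (the seed and the names `centred`, `manual` and `first` are illustrative only):

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x1 <- runif(100)
x2 <- runif(100)
d  <- cbind(x1, x2)
pc <- prcomp(d)

# Subtract the column means that prcomp stored, then rotate
centred <- sweep(d, 2, pc$center)
manual  <- centred %*% pc$rotation

# The whole score matrix matches pc$x up to floating-point error
all.equal(manual, pc$x, check.attributes = FALSE)

# First score on PC1: column 1 of the rotation matrix supplies both weights
first <- pc$rotation[1, 1] * centred[1, 1] + pc$rotation[2, 1] * centred[1, 2]
all.equal(first, pc$x[1, 1], check.attributes = FALSE)
```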