Hi all,

This will most likely expose that I am new to R, but in SPSS, running lags is very easy. Obviously this is user error, but what am I missing?

x <- sample(c(1:9), 10, replace = TRUE)
y <- lag(x, 1)
ds <- cbind(x, y)
ds

Results in:

      x y
 [1,] 4 4
 [2,] 6 6
 [3,] 3 3
 [4,] 4 4
 [5,] 3 3
 [6,] 5 5
 [7,] 8 8
 [8,] 9 9
 [9,] 3 3
[10,] 7 7

I figured I would see:

     x y
 [1,] 4 
 [2,] 6 4
 [3,] 3 6
 [4,] 4 3
 [5,] 3 4
 [6,] 5 3
 [7,] 8 5
 [8,] 9 8
 [9,] 3 9
[10,] 7 3

Any guidance will be much appreciated.

+4  A: 

lag does not shift the data; it only shifts the "time base". x has no time base, so cbind does not work as you expected. Try cbind(as.ts(x), lag(x)) and notice that a lag of 1 moves the series one period earlier in time, so the conventional "previous value" lag needs a negative k.
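
To make the time-base shift concrete, here is a minimal base R sketch. The vector values are made up for illustration. It compares the `tsp` attribute (start, end, frequency) before and after `lag`, then binds the series so that they align by time.

```r
# Illustrative data; any numeric vector would do
x <- c(4, 6, 3, 4, 3)
tx <- as.ts(x)                 # times 1..5, frequency 1

tsp(tx)                        # start 1, end 5
tsp(lag(tx, 1))                # start 0, end 4: same values, earlier times
tsp(lag(tx, -1))               # start 2, end 6: same values, later times

# cbind on ts objects aligns by time, so the result covers times 1..6
# and the lagged column is padded with NA where it has no value
both <- cbind(tx, lagged = lag(tx, -1))
both
```

Note that the combined series has one more row than `x`, because the lagged series extends one period past the end of the original.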

I would suggest using zoo / xts for time series. The zoo vignettes are particularly helpful.

Joshua Ulrich
Neither `zoo` nor `xts` seems to be stock, where do I get them?
Zack
`install.packages("xts") # this will install zoo as well`
Joshua Ulrich
+1  A: 

lag() works with time series, whereas you are trying to use bare matrices. This old question suggests using embed instead, like so:

lagmatrix <- function(x,max.lag) embed(c(rep(NA,max.lag), x), max.lag+1)

For instance:

> x
[1] 8 2 3 9 8 5 6 8 5 8
> lagmatrix(x, 1)
      [,1] [,2]
 [1,]    8   NA
 [2,]    2    8
 [3,]    3    2
 [4,]    9    3
 [5,]    8    9
 [6,]    5    8
 [7,]    6    5
 [8,]    8    6
 [9,]    5    8
[10,]    8    5
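
The same helper should extend to several lags at once. A short sketch, repeating the `lagmatrix` definition so the block runs on its own and calling it with `max.lag = 2`; the column names are added only for readability here and are not part of the original answer.

```r
lagmatrix <- function(x, max.lag) embed(c(rep(NA, max.lag), x), max.lag + 1)

x <- c(8, 2, 3, 9, 8)
m <- lagmatrix(x, 2)
colnames(m) <- c("x", "lag1", "lag2")   # labels chosen for this example
m
```

Each row holds the current value followed by the one and two observations before it, with `NA` where no earlier value exists.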
Zack
+2  A: 

Another way to deal with this is using the zoo package, which has a lag method that will pad the result with NA:

> library(zoo)
> set.seed(123)
> x <- zoo(sample(c(1:9), 10, replace = TRUE))
> y <- lag(x, -1, na.pad = TRUE)
> cbind(x, y)
   x  y
1  3 NA
2  8  3
3  4  8
4  8  4
5  9  8
6  1  9
7  5  1
8  9  5
9  5  9
10 5  5

The result is a multivariate zoo object (which is an enhanced matrix), but it is easily converted to a data.frame via:

> data.frame(cbind(x, y))
Gavin Simpson
Also note that if z is a zoo series then lag(z, 0:-1) is a two column zoo series with the original series and a lagged series. Also, coredata(z) will return just the data part of a zoo series and as.data.frame(z) will return a data frame with the data part of z as the column contents.
G. Grothendieck
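
The points in the comment above can be tried out with a small sketch. It assumes the zoo package is installed, and the series values are made up for illustration.

```r
library(zoo)  # assumes zoo is installed: install.packages("zoo")

z <- zoo(c(3, 8, 4, 8, 9))     # illustrative values, default index 1..5

both <- lag(z, 0:-1)           # two columns: the series and its lag
both

coredata(both)                 # just the data part, as a plain matrix
as.data.frame(both)            # data frame with one column per series
```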
+1  A: 

Just get rid of lag. Change your line for y to:
y <- c(NA, x[-length(x)])
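
For lags longer than one, the same idea can be wrapped in a small function. This is only a sketch: `shift_lag` is a hypothetical helper name, not a base R function, and the data are made up.

```r
# shift_lag is a hypothetical helper, not part of base R
shift_lag <- function(x, k = 1) {
  stopifnot(k >= 0, k <= length(x))
  if (k == 0) return(x)
  # pad the front with k NAs and drop the last k values
  c(rep(NA, k), head(x, -k))
}

x <- c(4, 6, 3, 4, 3)
cbind(x, y = shift_lag(x, 1))
```

The result always has the same length as `x`, so it can go straight into `cbind` or a data frame column.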

frankc