What's the most efficient way to create a moving average or rolling sum in R? How do you do the rolling function along with a "group by"?
I scratched up a good answer from Achim Zeileis over on the r list. Here's what he said:
library(zoo)
## create data
x <- rnorm(365)
## transform to regular zoo series with "Date" index
x <- zooreg(x, start = as.Date("2004-01-01")) plot(x)
## add rolling/running/moving average with window size 7
lines(rollmean(x, 7), col = 2, lwd = 2)
## if you don't want the rolling mean but rather a weekly ## time series of means you can do
nextfri <- function(x) 7 * ceiling(as.numeric(x - 1)/7) + as.Date(1) xw <- aggregate(x, nextfri, mean)
## nextfri is a function which computes for a certain "Date" ## the next friday. xw is then the weekly series.
lines(xw, col = 4)
Achim went on to say:
Note, that the difference between is rolling mean and the aggregated series is due to different alignments. This can be changed by changing the 'align' argument in
rollmean()
or thenextfri()
function in the aggregate call.
All this came from Achim, not from me: http://tolstoy.newcastle.edu.au/R/help/05/06/6785.html
While zoo is great, sometimes there are simpler ways. If you data behaves nicely, and is evenly spaced, the embed() function effectively lets you create multiple lagged version of a time series. If you look inside the VARS package for vector auto-regression, you will see that the package author chooses this route.
For example, to calculate the 3 period rolling average of x, where x = (1 -> 20)^2:
> x <- (1:20)^2
> embed (x, 3)
[,1] [,2] [,3]
[1,] 9 4 1
[2,] 16 9 4
[3,] 25 16 9
[4,] 36 25 16
[5,] 49 36 25
[6,] 64 49 36
[7,] 81 64 49
[8,] 100 81 64
[9,] 121 100 81
[10,] 144 121 100
[11,] 169 144 121
[12,] 196 169 144
[13,] 225 196 169
[14,] 256 225 196
[15,] 289 256 225
[16,] 324 289 256
[17,] 361 324 289
[18,] 400 361 324
> apply (embed (x, 3), 1, mean)
[1] 4.666667 9.666667 16.666667 25.666667 36.666667 49.666667
[7] 64.666667 81.666667 100.666667 121.666667 144.666667 169.666667
[13] 196.666667 225.666667 256.666667 289.666667 324.666667 361.666667