Hello,

I have decided to learn R. I am trying to get a sense of how to write "R style" functions and to avoid looping. Here is a sample situation:

Given a vector a, I would like to compute a vector b whose elements b[i] (the vector index begins at 1) are defined as follows:

1 <= i <= 4:
b[i] = NaN

5 <= i <= length(a):
b[i] = mean(a[i-4] to a[i])

Essentially, if we pretend 'a' is a list of speeds where the first entry is at time = 0, the second at time = 1 second, the third at time = 2 seconds... I would like to obtain a corresponding vector describing the average speed over the past 5 seconds.

E.g.: If a is (1,1,1,1,1,4,6,3,6,8,9) then b should be (NaN, NaN, NaN, NaN, 1, 1.6, 2.6, 3, 4, 5.4, 6.4)
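For reference, the definition above can be written out element by element. This is only a sketch of the specification, not the vectorized solution being asked for: vapply is still an implicit loop, and the window length of 5 is hard-coded to match the example.

```r
a <- c(1, 1, 1, 1, 1, 4, 6, 3, 6, 8, 9)

# b[i] is NaN for the first four positions, otherwise the mean of a[i-4], ..., a[i]
b <- vapply(seq_along(a), function(i) {
  if (i < 5) NaN else mean(a[(i - 4):i])
}, numeric(1))

b
```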

I could do this using a loop, but I feel that doing so would not be in "R style".

Thank you,

Tungata

+1  A: 

Something like b = filter(a, rep(1.0/5, 5), sides=1) will do the job, although the first four slots will be NA rather than NaN, and the result is a time-series (ts) object rather than a plain vector. R has a large library of built-in functions, and "R style" is to use those wherever possible. Take a look at the documentation for the filter function (?filter).
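A minimal sketch of that call on the sample vector from the question. It assumes base R's stats::filter, spelled with its namespace because some packages (dplyr, for instance) define their own filter; the as.numeric() conversion is an addition here, used only to drop the time-series attributes from the result.

```r
a <- c(1, 1, 1, 1, 1, 4, 6, 3, 6, 8, 9)

# sides = 1 uses only the current and past values, i.e. a trailing 5-point window
b <- as.numeric(stats::filter(a, rep(1/5, 5), sides = 1))

b  # the first four entries are NA, since no full window exists there yet
```

If NaN is specifically required instead of NA, b[is.na(b)] <- NaN would convert the padding, assuming a itself contains no missing values.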

Ian Ross
Yes, that is probably the simplest possible moving-average implementation
Dirk Eddelbuettel
+2  A: 

Because these rolling functions often apply with time-series data, some of the newer and richer time-series data-handling packages already do that for you:

R> library(zoo)   ## load zoo
R> speed <- c(1,1,1,1,1,4,6,3,6,8,9)
R> zsp <- zoo( speed, order.by=1:length(speed) )  ## creates a zoo object
R> rollmean(zsp, 5)                               ## default use
  3   4   5   6   7   8   9 
1.0 1.6 2.6 3.0 4.0 5.4 6.4 
R> rollmean(zsp, 5, na.pad=TRUE, align="right")   ## with padding and aligned
  1   2   3   4   5   6   7   8   9  10  11 
 NA  NA  NA  NA 1.0 1.6 2.6 3.0 4.0 5.4 6.4 
R>

The zoo package has excellent documentation that shows many more examples, in particular how to do this with real (and possibly irregular) dates; xts extends this further, but zoo is a better starting point.

Dirk Eddelbuettel
+1  A: 

You can also use a combination of cumsum and diff to get the sum over sliding windows. You'll need to pad with your own NaN, though:

> speed <- c(1,1,1,1,1,4,6,3,6,8,9)
> diff(cumsum(c(0,speed)), 5)/5
[1] 1.0 1.6 2.6 3.0 4.0 5.4 6.4
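One way the padding might be added is sketched below. The names k and window_means are made up for illustration, and the sketch assumes the input has at least k elements.

```r
speed <- c(1, 1, 1, 1, 1, 4, 6, 3, 6, 8, 9)
k <- 5  # window length (hypothetical name)

# cumsum(c(0, speed)) holds running totals; a lag-k difference of those
# totals is the sum of each k-element window
window_means <- diff(cumsum(c(0, speed)), lag = k) / k

# prepend k - 1 NaN values so the result lines up with the input
b <- c(rep(NaN, k - 1), window_means)

b
```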
Steve Lianoglou
A: 

This doesn't answer your eventual question, but to avoid loops in R as your title suggests, check out this page on the R Project wiki.

fideli