I have a data.frame with multiple columns. One of the columns is time and is therefore non-decreasing. The remaining columns contain observations recorded at the time given in the same row.

I want to select a window of time, say "x" seconds, and calculate the average (or for that matter any function) of the entries in some other columns in the same data.frame for that window.

Of course, because it's a time-based average, the number of rows that fall into a given window can vary with the data.

I have done this using a custom function, which creates a new column in the data.frame. The new column assigns a single number to all the entries in a time window, and that number is unique across all the time windows. This essentially divides the data into groups based on the time windows. Then I use R's "aggregate" function to calculate the mean.
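Roughly, the approach looks like the sketch below, which uses only base R. The column names (`time`, `value`), the helper `window_id`, the example data, and the 10-second width are all made up for illustration; it also assumes the time column is already sorted, as described above.

```r
# Hypothetical example data: 30 irregular timestamps within one minute.
set.seed(1)
df <- data.frame(
  time  = as.POSIXct("2010-10-20 13:34:00", tz = "UTC") + sort(runif(30, 0, 60)),
  value = rnorm(30)
)

# Hypothetical helper: label each row with the index of its time window.
# Windows are `width` seconds long and start at the first timestamp
# (this relies on `time` being non-decreasing).
window_id <- function(time, width) {
  elapsed <- as.numeric(time) - as.numeric(time[1])
  elapsed %/% width
}

df$window <- window_id(df$time, width = 10)

# Apply a summary function per window; mean here, but any function works.
result <- aggregate(value ~ window, data = df, FUN = mean)
print(result)
```

Each window can hold a different number of rows; `table(df$window)` shows how many fell into each one.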

I was just wondering if there is an existing R function that can do the grouping based on a time interval or if there is a better (cleaner) way to do this.

A: 

Assuming your data.frame contains only numeric data (apart from the time column), this is one way to do it using zoo/xts:

> library(xts)  # also loads zoo
> Data <- data.frame(Time=Sys.time()+1:20,x=rnorm(20))
> xData <- xts(Data[,-1], Data[,1])
> period.apply(xData, endpoints(xData, "seconds", 5), colMeans)
                           [,1]
2010-10-20 13:34:19 -0.20725660
2010-10-20 13:34:24 -0.01219346
2010-10-20 13:34:29 -0.70717312
2010-10-20 13:34:34  0.09338097
2010-10-20 13:34:38 -0.22330363

EDIT: using only base R packages. The means are the same, but the times are slightly different because endpoints starts the 5-second interval with the first observation. The code below groups on 5-second intervals starting with seconds = 0.

> nSeconds <- 5
> agg <- aggregate(Data[,-1], by=list(as.numeric(Data$Time) %/% nSeconds), mean)
> agg[,1] <- .POSIXct(agg[,1]*nSeconds)  # >= R-2.12.0 required for .POSIXct
Joshua Ulrich
Thanks for your answer. It is correct, but I am still interested in how others would solve it using R.
nixbox
But this _is_ using R... or are you looking for a solution that only uses base R packages?
Joshua Ulrich
Yes, something that uses base R or at least preserves the data.frame type. I checked that I can use as.data.frame on the xts object to convert it back; the only catch is that I would have to explicitly add another column from row.names (the time information) in order to create graphs with ggplot2.
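For reference, a sketch of one way to do that conversion, assuming the xts package is installed. It uses `index()` and `coredata()` from zoo/xts rather than `row.names`; the example data and the object name `df` are placeholders, not taken from the answer.

```r
library(xts)  # also attaches zoo, which provides index() and coredata()

# Hypothetical example data, shaped like the answer's Data object.
Data  <- data.frame(Time = as.POSIXct("2010-10-20 13:34:15", tz = "UTC") + 1:20,
                    x = rnorm(20))
xData <- xts(as.matrix(Data["x"]), order.by = Data$Time)

# Build the data.frame from the time index and the data matrix, so the
# timestamps stay POSIXct instead of becoming character row names.
df <- data.frame(Time = index(xData), coredata(xData))
str(df)
```

The result has an ordinary `Time` column next to the data columns, which is the shape ggplot2 expects.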
nixbox
Thanks for the solution, exactly what I was looking for
nixbox