I have a timeseries of samples in R:

> str(d)
 'data.frame': 5 obs. of  3 variables:
 $ date: POSIXct, format: "2010-03-04 20:47:00" "2010-03-04 21:47:00" ...
 $ x   : num  0 10 11 15.2 20
 $ y   : num  0 5 7.5 8.4 12.5
> d
                 date    x    y
1 2010-03-04 20:47:00  0.0  0.0
2 2010-03-04 21:47:00 10.0  5.0
3 2010-03-04 22:47:00 11.0  7.5
4 2010-03-04 23:47:00 15.2  8.4
5 2010-03-05 00:47:00 20.0 12.5

In this example, samples for x and y are taken every hour, but the time delta is not fixed. The x and y values always grow (like a mileage counter in a car). I need the deltas, i.e. how much each value grew between consecutive samples, something like this:

1 2010-03-04 20:47:00  0.0  0.0
2 2010-03-04 21:47:00 10.0  5.0
3 2010-03-04 22:47:00  1.0  2.5
4 2010-03-04 23:47:00  4.2  0.9
5 2010-03-05 00:47:00  4.8  4.1

I also need the deltas per unit of time: the x and y deltas divided by the elapsed time, i.e. the delta per hour.

How would I do this in R?

+2  A: 

Just use diff() once you have switched to a time-aware data structure such as zoo:

> library(zoo)
> DF <- data.frame(date=Sys.time() + 0:4*3600, x = cumsum(runif(5)*10), 
                                               y=cumsum(runif(5)*20))
> DF
                 date       x      y
1 2010-04-09 15:14:54  9.6282 14.709
2 2010-04-09 16:14:54 12.4041 28.665
3 2010-04-09 17:14:54 18.1643 34.244
4 2010-04-09 18:14:54 27.5785 41.028
5 2010-04-09 19:14:54 33.2779 57.020
> zdf <- zoo(DF[,-1], order.by=DF[,1])
> diff(zdf)
                         x       y
2010-04-09 16:14:54 2.7759 13.9556
2010-04-09 17:14:54 5.7602  5.5792
2010-04-09 18:14:54 9.4142  6.7844
2010-04-09 19:14:54 5.6995 15.9919

You can easily pad the first row back, merge, etc. See the excellent documentation for package zoo for details.
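The question also asks for the delta per hour, which the diff() output above does not show. A minimal sketch in base R (a plain alternative to the zoo approach, not taken from it), assuming the question's data frame d; the names hours, deltas, dx_per_hour and dy_per_hour are made up for illustration:

```r
# Rebuild the question's data frame (timezone fixed to UTC as an assumption)
d <- data.frame(
  date = as.POSIXct("2010-03-04 20:47:00", tz = "UTC") + 0:4 * 3600,
  x    = c(0, 10, 11, 15.2, 20),
  y    = c(0, 5, 7.5, 8.4, 12.5)
)

# Elapsed time between consecutive samples, converted to hours.
# diff() on POSIXct returns a difftime; as.numeric() with units = "hours"
# makes the unit explicit even when the sampling interval is irregular.
hours <- as.numeric(diff(d$date), units = "hours")

# Growth between consecutive samples, stamped with the later sample's time
deltas <- data.frame(
  date = d$date[-1],
  dx   = diff(d$x),
  dy   = diff(d$y)
)

# Growth rate per hour
deltas$dx_per_hour <- deltas$dx / hours
deltas$dy_per_hour <- deltas$dy / hours

print(deltas)
```

Because the samples here are exactly one hour apart, the per-hour columns equal the raw deltas; with irregular sampling they would differ. The first sample has no predecessor, so the result has one row fewer than d.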

Dirk Eddelbuettel