ansaurus

Question

working with sequences of different length

Answer 1

+2 A:

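Data frames are lists. Suppose the distances between timestamps are in the vector "x" in the list/data.frame y. You could use sort(-table(y[["x"]]))[1] to get the mode: the name of the returned element is the most frequent value, and the element itself is its negated count.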
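
A minimal sketch of that one-liner on made-up data (the data frame y and its gaps in seconds are invented for illustration), showing how to pull the mode out as a number:

```r
# Hypothetical example data: gaps (in seconds) between consecutive timestamps.
y <- data.frame(x = c(60, 60, 120, 60, 300, 120))

counts <- table(y[["x"]])            # frequency of each distinct gap
top <- sort(-counts)[1]              # most frequent gap comes first; its value is the negated count
mode_gap <- as.numeric(names(top))   # the gap itself is stored in the name

print(top)
print(mode_gap)
```

Negating the counts is only a way to sort in decreasing order; sort(counts, decreasing = TRUE)[1] should give the same name with a positive count.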

Eduardo Leoni 2009-12-07 16:45:19

My data contains only timestamps. That is, all columns contain timestamps and I want to examine all of them.

mariotomo 2009-12-08 08:15:31

Answer 2

+2 A:

The best way to approach this is probably to use an irregular time series object (see the time series task view on CRAN). You have several good options (e.g. timeSeries, its, fts, xts), but the most popular choice is the zoo package. You can create a time series like so:

library(zoo)
x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 9, 14) - 1
x <- zoo(rnorm(5), x.Date)

Then, to see the difference in time between each event, you can just use the diff function to create a difftime object:

> diff(index(x))
Time differences in days
[1] 2 4 2 5

You can analyze these time differences just like you would any other variable, for instance:

> summary(diff(index(x)))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00    2.00    3.00    3.25    4.25    5.00

Similarly, to find the most common time difference, you can use any other standard approach such as table():

> table(diff(index(x)))
2 4 5 
2 1 1

Shane 2009-12-08 00:23:18

I'm afraid my problem is that I am not (yet) confident with the "any other case" and with the "other standard approach[es]".

mariotomo 2009-12-08 08:13:11

Answer 3

A:

I think I would settle on this one. It takes the median of the positive steps, so it returns the most common step only if that step really occurs in more than 50% of the cases.

mostCommonStep <- function(L) {
  ## returns the value of the most common difference between
  ## subsequent elements.

  ## takes into account only forward steps, all negative steps are
  ## discarded.  works with list, data.frame, matrix.
  L <- diff(unlist(sapply(as.list(L), as.numeric)))
  as.numeric(quantile(L[L>0], 0.5))
}
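
If the most common step cannot be assumed to cover more than half of the cases, counting with table() instead of taking the median is one option. The following is only a sketch under that assumption: mostCommonStepByTable and the seqs data are hypothetical names made up here, not part of the answer above. It also takes differences within each sequence separately, so no step is computed across the boundary between two sequences. Unlike the function above, it handles lists and data frames only, not matrices.

```r
# Hypothetical helper: most common positive step, counted with table().
mostCommonStepByTable <- function(L) {
  # diff() within each sequence, then pool all the steps
  steps <- unlist(lapply(as.list(L), function(s) diff(as.numeric(s))))
  # keep forward steps only; NA steps (e.g. from NA padding) are dropped
  steps <- steps[!is.na(steps) & steps > 0]
  if (length(steps) == 0) return(NA_real_)
  counts <- table(steps)
  # which.max() returns the first maximum, so ties go to the smallest step
  as.numeric(names(counts)[which.max(counts)])
}

# Made-up sequences of different length, stored in a list
seqs <- list(a = c(0, 10, 20, 35, 45),
             b = c(100, 110, 115, 125),
             c = c(7, 17))
print(mostCommonStepByTable(seqs))
```

Timestamps stored as POSIXct are converted by as.numeric() to seconds, so the returned step would be in seconds in that case.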

mariotomo 2009-12-09 07:57:22
