Suppose I have a list as follows

bar=c()
bar["1997-10-14"]=1
bar["2001-10-14"]=2
bar["2007-10-14"]=1

How can I select from this list all values whose index falls within a specific date range? For example, for the range "1995-01-01" to "2000-06-01" I should get 1, and for the range "2001-09-01" to "2007-11-04" I should get 2 and 1.

+1  A: 

You need to convert your dates from characters into a Date type with as.Date() (or a POSIX type if you have more information like the time of day). Then you can make comparisons with standard relational operators such as <= and >=.

You should consider using a time series package such as zoo for this.

Edit:

Just to respond to your comment, here's an example of using dates with your existing vector:

> as.Date(names(bar)) < as.Date("2001-10-14")
[1]  TRUE FALSE FALSE
> bar[as.Date(names(bar)) < as.Date("2001-10-14")]
1997-10-14 
         1
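To cover a full range as asked in the question, the same comparison can be combined with a lower bound. This is a minimal base R sketch; in_range is a hypothetical helper name (not part of any package), and treating both bounds as inclusive is an assumption:

```r
# Rebuild the named vector from the question
bar <- c("1997-10-14" = 1, "2001-10-14" = 2, "2007-10-14" = 1)

# Hypothetical helper: keep the elements whose name, read as a Date,
# lies within [from, to] (both bounds inclusive by assumption)
in_range <- function(x, from, to) {
  d <- as.Date(names(x))
  x[d >= as.Date(from) & d <= as.Date(to)]
}

in_range(bar, "1995-01-01", "2000-06-01")
in_range(bar, "2001-09-01", "2007-11-04")
```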

Although you really should just use a time series package. Here's how you could do this with zoo (or xts, timeSeries, fts, etc.):

library(zoo)
ts <- zoo(c(1, 2, 1), as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))
ts[index(ts) < as.Date("2001-10-14"),]

Since the index is now a Date type, you can make as many comparisons as you want. Read the zoo vignette for more information.
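For a range with both a start and an end, zoo also provides a window() method, which may read more clearly than chaining comparisons. A sketch on the same data; the object is named z here (my choice, not from the answer) to avoid masking stats::ts:

```r
library(zoo)

# Same series as above, with a Date index
z <- zoo(c(1, 2, 1), as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))

# window() keeps the observations whose index lies between start and end
window(z, start = as.Date("1995-01-01"), end = as.Date("2000-06-01"))
window(z, start = as.Date("2001-09-01"), end = as.Date("2007-11-04"))
```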

Shane
Hmmm, I might be doing it wrong, but if I do something like bar[as.Date("2001-10-14")] I get very strange results involving a lot of NAs.
Pieter
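A likely explanation for the NAs mentioned in the comment: a Date is stored as a count of days since 1970-01-01, so passing one directly to [ is treated as a positional index far beyond the length of the vector. A short sketch of the difference, assuming the bar vector from the question:

```r
bar <- c("1997-10-14" = 1, "2001-10-14" = 2, "2007-10-14" = 1)

d <- as.Date("2001-10-14")

# A Date is a day count under the hood, so this acts as a large positional index
unclass(d)
bar[d]                          # no such position, so the result is NA

# Index by name (character) or by a logical comparison instead
bar[format(d)]
bar[as.Date(names(bar)) == d]
```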
A: 

Using the fact that dates written as YYYY-MM-DD sort lexically in chronological order:

bar[names(bar) > "1995-01-01" & names(bar) < "2000-06-01"]
# 1997-10-14 
#          1 

bar[names(bar) > "2001-09-01" & names(bar) < "2007-11-04"]
# 2001-10-14 2007-10-14 
#          2          1 

The result is a named vector (like your original bar, which is a named vector, not a list).

As Dirk states in his answer, it's better to use Date for efficiency reasons. Without external packages, you could rearrange your data into two vectors (or a two-column data.frame), one for dates and one for values:

bar_dates <- as.Date(c("1997-10-14", "2001-10-14", "2007-10-14"))
bar_values <- c(1,2,1)

then use simple indexing:

bar_values[bar_dates > as.Date("1995-01-01") & bar_dates < as.Date("2000-06-01")]
# [1] 1

bar_values[bar_dates > as.Date("2001-09-01") & bar_dates < as.Date("2007-11-04")]
# [1] 2 1
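
The two-column data.frame variant mentioned above could look roughly like this; the names bar_df, date and value are illustrative choices, not taken from the question:

```r
# Two-column data.frame: one column of Dates, one of values
bar_df <- data.frame(
  date  = as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")),
  value = c(1, 2, 1)
)

# Rows strictly between the two dates, matching the comparisons used above
subset(bar_df, date > as.Date("2001-09-01") & date < as.Date("2007-11-04"))
```

Keeping dates and values in one data.frame means a single filter returns both the matching dates and their values.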
Marek
+2  A: 

This problem has been solved for good with the xts package which extends functionality from the zoo package.

R> library(xts)
Loading required package: zoo
R> bar <- xts(1:3, order.by=as.Date("2001-01-01")+365*0:2)
R> bar
           [,1]
2001-01-01    1
2002-01-01    2
2003-01-01    3
R> bar["2002::"]        ## open range with a start year
           [,1]
2002-01-01    2
2003-01-01    3
R> bar["::2002"]        ## or end year
           [,1]
2001-01-01    1
2002-01-01    2
R> bar["2002-01-01"]    ## or hits a particular date
           [,1]
2002-01-01    2
R> 

There is a lot more here -- but the basic point is: do not operate on strings masquerading as dates.

Use a Date type, or preferably even an extension package built to efficiently index on millions of dates.
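
Applied to the data from the question, a two-sided range can be written as a single "from/to" string. A sketch only; the object is called bar_xts here for illustration:

```r
library(xts)

# The question's data as an xts object indexed by Date
bar_xts <- xts(c(1, 2, 1),
               order.by = as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))

# ISO 8601 style "from/to" range strings
bar_xts["1995-01-01/2000-06-01"]
bar_xts["2001-09-01/2007-11-04"]
```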

Dirk Eddelbuettel