Greetings,

I have a table that looks like the following:

      date value
2007-11-05 134
2007-12-08 234
2008-03-10 322
2008-03-11 123
...

In summary, it covers three years of daily values, but not every day has a row. I need to draw a line chart, plot(data$date, data$value), for the whole time span, where any day without a row takes the last known value. In other words, the table only has rows for the days on which the value changed.

Are there any R experts out there who can give me a hand? :-)

Thanks in advance!

+1  A: 

Something like this?

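library(zoo)
# Sample data from the question
data <- data.frame(date = as.Date(c('2007-11-05', '2007-12-08', '2008-03-10', '2008-03-11')), value = c(134, 234, 322, 123))
# Convert to a zoo series indexed by date
data <- zoo(data$value, data$date)
# Every calendar day between the first and last observation
days <- seq(start(data), end(data), "day")
# Merge with an empty daily series, then carry the last observation forward
data2 <- na.locf(merge(data, zoo(, days)))
plot(data2)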
Curt Hagenlocher
Hmmm... "An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo's key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics."
Hugo S Ferreira
Great :-) I am having one problem, though. My original data has repeated entries for a single day, and I only want to consider the last one (not to mention the error 'series cannot be merged with non-unique index entries in a series'). Could you give me a hint on this one too?
Hugo S Ferreira
I think aggregate.zoo will do this. See the zoo FAQ at http://cran.r-project.org/web/packages/zoo/vignettes/zoo-faq.pdf
Curt Hagenlocher
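
As a minimal sketch of the aggregate approach mentioned above: the sample dates and values here are made up for illustration, and the aggregate(z, identity, tail, 1) idiom follows the pattern shown in the zoo FAQ. It keeps the last entry per day, after which the merge and na.locf steps from the answer should work without the non-unique index error.

```r
library(zoo)

# Hypothetical sample data with two entries on 2007-11-05.
dates  <- as.Date(c("2007-11-05", "2007-11-05", "2007-12-08", "2008-03-10"))
values <- c(132, 134, 234, 322)

# zoo() warns about the non-unique index; suppress that while building the series.
z <- suppressWarnings(zoo(values, dates))

# Collapse duplicates: group by the index itself and keep the last value per day.
z_last <- aggregate(z, identity, tail, 1)

# The index is now unique, so the series can be merged and filled as before.
days   <- seq(start(z_last), end(z_last), by = "day")
filled <- na.locf(merge(z_last, zoo(, days)))
```

This assumes the rows arrive in the order in which they should be ranked within a day, since tail(x, 1) simply takes whichever entry comes last.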
+1  A: 

Hugo, are all the repeated values for a single day the same or different? If they are the same, you could use the sqldf package to select the distinct date and value pairs and plot those. If they differ, you could plot with ggplot2's geom_step to get a step chart, which will show the range of values at the same x-axis position. See the code example below, where I added two values for 2008-01-15.

library(ggplot2)
data <- data.frame(date = as.Date(c('2007-11-05', '2007-12-08', '2008-03-10',
                                    '2008-03-11', '2008-01-15', '2008-01-15')),
                   value = c(134, 234, 322, 123, 175, 275))
ggplot(data, aes(x = date, y = value)) + geom_step()

If the multiple values for a day are identical, ggplot will draw them at the same point, so they appear as one.
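
For the case where the repeated rows are exact duplicates, here is a small sketch that uses base R's unique() instead of sqldf (a swapped-in technique, not the sqldf query itself). The data frame is hypothetical sample data, and the plot uses base graphics rather than ggplot2.

```r
# Hypothetical data with an exact duplicate row for 2008-01-15.
data <- data.frame(
  date  = as.Date(c("2007-11-05", "2008-01-15", "2008-01-15", "2008-03-10")),
  value = c(134, 175, 175, 322)
)

# unique() drops rows whose date and value both repeat.
distinct <- unique(data)
distinct <- distinct[order(distinct$date), ]

# Base-graphics step plot: type = "s" holds each value until the next change.
plot(distinct$date, distinct$value, type = "s", xlab = "date", ylab = "value")
```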

wahalulu
A: 

Try this. We read in the data, aggregating with tail(x, 1) to keep the last value for any day, and then plot it. (The read.zoo line with textConnection keeps the example self-contained; in practice it would be replaced with something like the commented-out line.)

Lines <- "date value
2007-11-05 132
2007-11-05 134
2007-12-08 231
2007-12-08 234
2008-03-10 322
2008-03-11 123"

library(zoo)

# z <- read.zoo("myfile.dat", header = TRUE, aggregate = function(x) tail(x, 1))

z <- read.zoo(textConnection(Lines), header = TRUE, aggregate = function(x) tail(x, 1))
plot(z)
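
One possible variation, sketched here as an assumption about what the question is after: by default plot(z) joins the observations with straight lines, whereas passing type = "s" draws steps, so each value is held until the next change. The data is the same as in the example above.

```r
library(zoo)

Lines <- "date value
2007-11-05 132
2007-11-05 134
2007-12-08 231
2007-12-08 234
2008-03-10 322
2008-03-11 123"

# Keep the last value recorded for each day while reading.
z <- read.zoo(textConnection(Lines), header = TRUE,
              aggregate = function(x) tail(x, 1))

# Step plot: hold each value flat until the next observation.
plot(z, type = "s", xlab = "date", ylab = "value")
```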
G. Grothendieck