Hi,

I am just starting out with R and have hit a wall with some time series data.

I have a time series (date and value) in 'zoo' format that I want to annotate with a cross whenever an event occurs. The events are irregular and stored in a CSV file (just the dates, sometimes repeated).

I've managed to read the dates into a format that R accepts, but I can't find a way to chart the main time series with the secondary events annotated on top.

Update: Sorry, I missed this out earlier. Below is the sort of data I am working with:

library(tseries)  # provides get.hist.quote()
price <- get.hist.quote(instrument = "msft", quote = c("Cl", "Vol"))

I now want to compare the number of tweets (for a search term) against this, but I only have spotty data of the form:

"February 28, 2010"
"February 20, 2010"
"February 20, 2010"
"August 21, 2009"

Some are repeated. So far I've managed to write a Python script to do some cleaning (i.e. produce tuples of date and occurrences), but I was hoping I could just work with the raw data in R?

Many thanks

+2  A: 

Providing a data sample would get you a more precise answer, but you have two general options:

Using the existing plot.zoo() function, you can add annotations after the plot is finished using (for instance) the text() function. Or using ggplot2, you can take a similar approach of creating the plot and adding the annotations (although it doesn't natively accept zoo objects as input).
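As a rough illustration of the first option, here is a minimal sketch that marks event dates with crosses on a plot.zoo() chart. The price series and event dates are made up for the example (they are assumptions, not the asker's data), and points() is used alongside text() because the question asks for cross markers.

```r
library(zoo)

# Synthetic stand-in for the price series (assumption: not real market data)
idx <- seq(as.Date("2010-01-01"), by = "day", length.out = 60)
price <- zoo(100 + cumsum(sin(seq_along(idx) / 5)), idx)

# Hypothetical event dates; repeated dates are collapsed to one marker
events <- unique(as.Date(c("2010-01-15", "2010-02-20", "2010-02-20")))

# Keep only events that fall on a date present in the series,
# then look up the series value on each of those dates
events <- events[events %in% index(price)]
event_values <- coredata(price)[match(events, index(price))]

# Draw the series, then add a cross (pch = 4) and a date label per event
plot(price, xlab = "Date", ylab = "Price")
points(events, event_values, pch = 4, col = "red", cex = 1.5)
text(events, event_values, labels = format(events, "%d %b"), pos = 3, cex = 0.8)
```

Events that fall on dates missing from the series (weekends, for a stock price) are simply dropped here; placing them at an interpolated value would be another reasonable choice.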

Alternatively, chartSeries in quantmod has many functions designed with this purpose in mind, and it accepts zoo as input.

Edit:

One quick comment about how to deal with the data that you posted in your question. The second set of dates should be converted into a zoo object (possibly with some kind of signifier as the data, such as the word "tweet"), and then merged with the original series. So you will have an additional column in your time series that represents these infrequent events. In most cases, this column will be NA.
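The merge described above might look something like the following sketch. The price values are synthetic and the names (event_dates, tweets, merged) are only illustrative; the event column here holds a count per date rather than the word "tweet", which is one possible choice of signifier.

```r
library(zoo)

# Synthetic daily series standing in for the price data (assumption)
idx <- seq(as.Date("2010-02-15"), by = "day", length.out = 20)
price <- zoo(seq(28, by = 0.1, length.out = 20), idx)

# Raw event dates in the format shown in the question
# (note: "%B" parsing depends on an English locale)
raw <- c("February 28, 2010", "February 20, 2010",
         "February 20, 2010", "August 21, 2009")
event_dates <- as.Date(raw, "%B %d, %Y")

# Count occurrences per date and turn the counts into a zoo series
counts <- table(event_dates)
tweets <- zoo(as.vector(counts), as.Date(names(counts)))

# Left join: keep every price date, add tweet counts where they exist.
# Dates without events get NA in the tweets column.
merged <- merge(price, tweets, all = c(TRUE, FALSE))

# One panel per column: line for price, vertical bars for event counts
plot(merged, type = c("l", "h"))
```

With all = c(TRUE, FALSE) the 2009 event is dropped because it lies outside the price series; using all = TRUE instead would keep it and leave the price as NA on that date.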

Shane
Good point - updated the question - would plot.zoo() still be the way to go?
flyingcrab
Thanks Shane - worked like a charm!
flyingcrab
@flyingcrab - Great! Would love to see the final output if you are able to post it.
Shane
Alas, nothing exciting came out... but I'll keep at it :)
flyingcrab
+1  A: 

Create zoo series, price and dd.zoo (where dd.zoo is the number of occurrences of each date in dd.character as a zoo object) and then just bind them together and use plot.zoo:

library(zoo)
library(tseries)

price <- get.hist.quote(instrument = "msft", quote = c("Cl", "Vol"))

dd.character <- c("February 28, 2010", "February 20, 2010",
    "February 20, 2010", "August 21, 2009")
dd.Date <- as.Date(dd.character, "%B %d, %Y")
dd.zoo <- aggregate(zoo(dd.Date), dd.Date, length)

plot(cbind(price$Close, dd.zoo), type = c("l", "h"), heights = c(3, 1))

Another possibility, if you just want to show the Close overlaid with vertical lines marking the dates in dd.character, is:

plot(price$Close)
abline(v = time(dd.zoo))
G. Grothendieck