This is a newbie question about R. I am downloading monthly stock price data from Yahoo Finance using R, where the ticker names are read from a text file. I loop over the ticker names, download the data for each one, and store the results in a list. My problem is that some ticker names may be incorrect, and my code stops when it encounters one. I want the following.

(1) Skip a ticker name if it is not correct. (2) Each element in the list is a data frame. I want the ticker name to be prefixed to the variable names in each element's data frame. (3) I need an efficient way to create a data frame that has the closing prices as variables.

Here is the sample code for the simplified version of my problem.

library(tseries)  
tckk <-c("MSFT","C","VIA/B","MMM")# ticker names defined  
numtk <-length(tckk);  
ustart <-"2000-12-30";uend <-"2007-12-30" #start and end date  
all_dat<-list()#empty list to fill in the data  
for(i in 1:numtk)  
{  
  all_dat[[i]]<-xxx<-get.hist.quote(instrument = tckk[i], start=ustart, end=uend,quote = c("Open", "High", "Low", "Close"),provider = "yahoo", compression = "m")  

}   

The code stops at the third entry, but I want to skip this ticker and move on to "MMM". I have heard about the tryCatch() function but do not know how to use it.
For question 2, I want the variable names for the first element of the list to be "MSFTopen", "MSFThigh", "MSFTlow", and "MSFTclose". Is there a better way to do this than combining a loop with the paste() function?
Finally, for question 3, I need a data frame with three columns corresponding to the closing prices. Again, I am trying to avoid a loop here. Thank you.

+7  A: 

Your best bet is to use quantmod and store the results as a time series (in this case, it will be xts):

library(quantmod)
library(plyr)  # provides l_ply() and llply()
symbols <- c("MSFT","C","VIA/B","MMM")

#1
l_ply(symbols, function(sym) try(getSymbols(sym))) 
symbols <- symbols[symbols %in% ls()]

#2
sym.list <- llply(symbols, get) 

#3
data <- xts()
for(i in seq_along(symbols)) {
    symbol <- symbols[i]
    data <- merge(data, get(symbol)[,paste(symbol, "Close", sep=".")])
}
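
For readers who want the tryCatch() form mentioned in the question, here is a minimal base-R sketch of points (1) and (2). It does not touch the network: `fetch_quotes` is a hypothetical stand-in for a real downloader such as get.hist.quote(), and it simply fails on any ticker containing "/" to mimic a bad symbol. `safe_fetch` is likewise an illustrative name, and the price values are made up.

```r
# Hypothetical stand-in for a real downloader; errors on a "bad" ticker.
fetch_quotes <- function(ticker) {
  if (grepl("/", ticker, fixed = TRUE)) stop("invalid ticker: ", ticker)
  data.frame(Open = 1:3, High = 2:4, Low = 0:2, Close = c(1.5, 2.5, 3.5))
}

tickers <- c("MSFT", "C", "VIA/B", "MMM")

# Return NULL instead of stopping when a download fails.
safe_fetch <- function(ticker) {
  tryCatch(
    fetch_quotes(ticker),
    error = function(e) {
      message("skipping ", ticker, ": ", conditionMessage(e))
      NULL
    }
  )
}

all_dat <- lapply(tickers, safe_fetch)
names(all_dat) <- tickers
all_dat <- Filter(Negate(is.null), all_dat)  # drop the tickers that failed

# Prefix every column with its ticker, e.g. "MSFTopen", without an explicit loop.
all_dat <- Map(function(df, tk) {
  names(df) <- paste0(tk, tolower(names(df)))
  df
}, all_dat, names(all_dat))
```

Swapping the body of `fetch_quotes` for an actual download call should leave the rest unchanged, since the error handler only relies on the call signalling an error.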
Shane
Thank you for the code. It works. However, originally I am reading the ticker names from a file that has more than 1000 tickers. So hard coding with the merge() may not help me. Moreover, I do want to put them in a list so that I can use plyr library to do other stuff to each list element.
Ok, well it's easy to do both of those things. Just loop over the symbols to merge them (rather than hard coding them). And store them in a list with plyr: `llply(symbol.names, get)`.
Shane
Updated with all your points.
Shane
Wow! this is great. Thank you so much Shane.
One note: I much prefer `auto.assign=FALSE` in `getSymbols()` as it avoids the insanity of filling the environment with all those symbols when all I want is the merged `data.frame`.
Dirk Eddelbuettel
You could further clean up the merge with: `merge(data, Cl(get(symbol)))`
Joshua Ulrich
+2  A: 

I'm a little late to the party, but I think this will be very helpful to other latecomers.

The stockSymbols function in TTR fetches instrument symbols from nasdaq.com, and adjusts the symbols to be compatible with Yahoo! Finance. It currently returns ~6,500 symbols for AMEX, NYSE, and NASDAQ. You could also take a look at the code in stockSymbols that adjusts tickers to be compatible with Yahoo! Finance to possibly adjust some of the tickers in your file.

NOTE: stockSymbols in the version of TTR on CRAN is broken due to a change on nasdaq.com, but it is fixed in the R-forge version of TTR.

Joshua Ulrich
+1 for this great tip. Is there a quick way to extract the ticker names for all companies listed on the NYSE and save them as a column in a flat file? I tried, but I didn't know how to extract the ticker names from the object returned by the stockSymbols function call. Thanks.
`write.table(stockSymbols("NYSE")$Symbol, "NYSE_symbols.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)`
Joshua Ulrich
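
Since the original question reads its tickers from a text file, a one-ticker-per-line file is probably the easiest shape to round-trip. The sketch below uses only base R; the `symbols` vector is an illustrative stand-in for the Symbol column, and the temporary path is an assumption made so the example does not write into the working directory.

```r
# Illustrative stand-in for stockSymbols("NYSE")$Symbol.
symbols <- c("MMM", "C", "GCI")

# Write one ticker per line, with no header and no quotes.
path <- file.path(tempdir(), "NYSE_symbols.txt")
writeLines(symbols, path)

# Read the tickers back as a plain character vector, ready for a download loop.
tickers <- readLines(path)
```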
Thank you for the tip. I did try the stockSymbols("NYSE")$Symbol command earlier, but it outputs strange things, not the ticker names. I guess it must be due to the CRAN version of TTR (I used the install.packages() command to install TTR).
Yes, you probably have the CRAN version. You can use `install.packages` to install the R-forge version: `install.packages("TTR",repos="http://r-forge.r-project.org")`.
Joshua Ulrich
+1  A: 

This is also a little late... If you want to grab data with just R's base functions, without dealing with any add-on packages, use read.csv(URL), where URL is a string pointing to the right place at Yahoo. The data will be pulled in as a data frame, and you will need to convert the 'Date' column from a string to a Date type for plots to look nice. A simple code snippet is below.

URL <- "http://ichart.finance.yahoo.com/table.csv?s=SPY"
dat <- read.csv(URL)
dat$Date <- as.Date(dat$Date, "%Y-%m-%d")  # the CSV's Date column is in YYYY-MM-DD form

Using R's base functions may give you more control over the data manipulation.
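
As one illustration of that kind of manipulation, data frames shaped like the one read.csv returns above (a Date column plus price columns) can be folded into a single closing-price table with merge(). This is only a sketch: the two data frames and their values are made up, and the "<ticker>close" naming simply follows the convention asked for in the question.

```r
# Toy per-ticker data frames standing in for real downloads (values are made up).
quotes <- list(
  MSFT = data.frame(Date  = as.Date(c("2007-10-01", "2007-11-01")),
                    Close = c(36.8, 33.6)),
  MMM  = data.frame(Date  = as.Date(c("2007-10-01", "2007-11-01", "2007-12-01")),
                    Close = c(86.4, 83.3, 84.3))
)

# Keep Date and Close only, renaming Close to "<ticker>close".
closes <- Map(function(df, tk) {
  out <- df[, c("Date", "Close")]
  names(out)[2] <- paste0(tk, "close")
  out
}, quotes, names(quotes))

# Fold the list into one data frame; all = TRUE keeps dates that some tickers lack.
close_df <- Reduce(function(a, b) merge(a, b, by = "Date", all = TRUE), closes)
```

Because the merge is an outer join on Date, a ticker with a shorter history ends up with NA in the rows it does not cover rather than shortening the whole table.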

stotastic
+1 This is a good tip. Maybe I am wrong, but I guess it will be less flexible when it comes to changing the start date, end date, and data frequency. The URL will get messier with those.
A few thoughts on this: you can put all of this in a utility function that constructs the URL, requests the data, and returns the data frame (there are several more parameters you can send to Yahoo in the URL, such as the start date). Also, since the data is in a data.frame, it's pretty easy to filter out the data you don't want. Finally, the 'merge' command is VERY useful for combining data frames of multiple tickers.
stotastic
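
A rough sketch of the URL-building half of such a utility function follows. The function name `yahoo_url` is hypothetical, and the query parameters (a, b, c for the start month minus one, day, and year; d, e, f for the end date; g for the frequency) are an assumption based on how the legacy ichart CSV interface was commonly used, so check them against whatever endpoint you actually call. The function only builds the string and does not request anything.

```r
# Build a CSV download URL for one symbol (parameter names are assumptions).
yahoo_url <- function(symbol, start, end, freq = "m") {
  start <- as.Date(start)
  end   <- as.Date(end)
  # Extract one numeric component (month, day, or year) from a Date.
  part <- function(d, fmt) as.integer(format(d, fmt))
  sprintf(
    "http://ichart.finance.yahoo.com/table.csv?s=%s&a=%d&b=%d&c=%d&d=%d&e=%d&f=%d&g=%s",
    utils::URLencode(symbol, reserved = TRUE),  # escape characters such as "/"
    part(start, "%m") - 1L, part(start, "%d"), part(start, "%Y"),
    part(end, "%m") - 1L,   part(end, "%d"),   part(end, "%Y"),
    freq
  )
}

url <- yahoo_url("SPY", "2000-12-30", "2007-12-30")
```

The resulting string can be passed to read.csv() in the same way as the hard-coded URL in the answer.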
@stotastic There are already several functions that do what you describe (`quantmod::getSymbols`, `TTR::getYahooData`, `tseries::get.hist.quote`, `fImport::yahooImport`, etc).
Joshua Ulrich
All nice functions that add a simplifying layer of abstraction, but I think it's important to know the basics of how these functions pull in the data... just using a 'black box' can be dangerous.
stotastic
Yes, it's helpful to understand how the functions work. However, you don't need to write a similar function to understand how they're pulling data. You can simply look at the source code... which isn't really a "black box".
Joshua Ulrich
I guess it is a matter of taste.
stotastic
A: 

I do it like this, because I need both the historical price list and a daily update file in order to run other packages.

library(fImport)  # provides yahooSeries()

fecha1<-"03/01/2009"

fecha2<-"02/02/2010"

Sys.time()

y<-format(Sys.time(), "%y")

m<-format(Sys.time(), "%m")

d<-format(Sys.time(), "%d")

fecha3<-paste(c(m,"/",d,"/","20",y), collapse="")

write.table(yahooSeries("GCI", from=fecha1, to=fecha2), file = "GCI.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)

write.table(yahooSeries("GCI", from=fecha2, to=fecha3), file = "GCIupdate.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)

GCI<-read.table("GCI.txt")

GCI1<-read.table("GCIupdate.txt")

GCI<-rbind(GCI1,GCI)

GCI<-unique(GCI)

write.table(GCI, file = "GCI.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)
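
The combine-and-deduplicate step can also be sketched with toy data and no file I/O. Everything here is illustrative: the two data frames and their values are made up, and keying the duplicates on a Date column (rather than on whole rows, as unique() does) is a design choice so that a revised price in the update replaces the older row. The last line shows a shorter way to build today's date in the same mm/dd/yyyy form as fecha3.

```r
# Toy history and update tables (values are made up).
history <- data.frame(Date  = as.Date(c("2010-01-29", "2010-02-01", "2010-02-02")),
                      Close = c(16.1, 16.4, 16.9))
update  <- data.frame(Date  = as.Date(c("2010-02-02", "2010-02-03")),
                      Close = c(17.0, 17.2))

# Put the update first so its rows win when a date appears in both tables.
combined <- rbind(update, history)
combined <- combined[!duplicated(combined$Date), ]
combined <- combined[order(combined$Date), ]

# Today's date as mm/dd/yyyy, without pasting the pieces together by hand.
today <- format(Sys.Date(), "%m/%d/%Y")
```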

René Bauch-gCapital Wealth Management

René Bauch