Hello,

I have some irregularly spaced data, say table A. The frequency is every 2-5 days. I have another data set, table B, which has entries for every weekday. I want to run the following regression:

A_{t} = alpha + beta1 * B_{t-2 months} + error

where, when I lag B, if there is no entry exactly 60 days back (e.g. because 60 days ago was a Sunday), I just pick the next Monday. I can of course construct this with a for loop, but what is the R way? Currently, the data are stored in MySQL tables and I am using RMySQL to access them.

Thanks for the help.

+3  A: 

You want the zoo package and its documentation --- which has numerous examples about how to aggregate, align, transform, ... data along the time dimension.

It is a hard problem. You'll have to think about how you do it --- but at least appropriate and powerful tools exist. There are also plenty of usage examples here and on the R lists.

At a minimum, you could use na.locf() to carry your last irregular observation forward to the next regular one (after having merged the data based on daily dates). You can then use lag() operators on the regular data. Also, packages dynlm and dyn facilitate modeling with lm() on data held in zoo objects by adding lags etc. to the formula interface.
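As a rough sketch of the merge-then-fill idea, here is one way it might look on made-up data. The series A and B, their dates, and their values are all invented for illustration; in practice they would come from the MySQL tables. Since the question asks for the next Monday rather than the previous Friday, this sketch fills backward with na.locf(fromLast = TRUE) instead of the default forward fill, which is an assumption about what is wanted.

```r
library(zoo)

# Hypothetical toy data: B has one value per weekday, A arrives every 2-5 days
b_dates <- seq(as.Date("2010-01-01"), as.Date("2010-06-30"), by = "day")
b_dates <- b_dates[!format(b_dates, "%u") %in% c("6", "7")]  # drop Sat/Sun
B <- zoo(seq_along(b_dates), b_dates)

set.seed(1)
a_dates <- as.Date("2010-04-01") + cumsum(c(0, 3, 2, 5, 4, 2, 3, 5, 2, 4))
A <- zoo(rnorm(length(a_dates)), a_dates)

# Put B on a full calendar-day grid (weekends become NA), then fill each gap
# with the NEXT available observation, so a Sunday picks up Monday's value
grid    <- zoo(, seq(start(B), end(B), by = "day"))
B_daily <- na.locf(merge(B, grid), fromLast = TRUE)

# For every date in A, look up B exactly 60 calendar days earlier
target <- index(A) - 60
B_lag  <- coredata(B_daily)[match(target, index(B_daily))]

# Regress A_t on the lagged B
fit <- lm(coredata(A) ~ B_lag)
summary(fit)
```

The first date in the toy A series is 2010-04-01, and 60 days before that is Sunday 2010-01-31, so its B_lag entry is the value from Monday 2010-02-01. If A dates reach back further than B covers, match() returns NA for those rows and lm() will drop them.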

Dirk Eddelbuettel
Thank you for the help
stevejb