views:

585

answers:

1
data <-c(88, 84, 85, 85, 84, 85, 83, 85, 88, 89, 91, 99, 104, 112, 126, 138, 146,151,   150, 148, 147, 149, 143, 132, 131, 139, 147, 150, 148, 145, 140, 134, 131, 131, 129, 126, 126, 132, 137, 140, 142, 150, 159, 167, 170, 171, 172, 172, 174, 175, 172, 172, 174, 174, 169, 165, 156, 142, 131, 121, 112, 104, 102, 99, 99, 95, 88, 84, 84, 87, 89, 88, 85, 86, 89, 91, 91, 94, 101, 110, 121, 135, 145, 149, 156, 165, 171, 175, 177, 182, 193, 204, 208, 210, 215, 222, 228, 226, 222, 220)

Why do the ARMA models acting on the first differences of the data differ from the corresponding ARIMA models?

for (p in 0:5)
{
for (q in 0:5)
{
#data.arma = arima(diff(data), order = c(p, 0, q));cat("p =", p, ", q =", q, "AIC =",  data.arma$aic, "\n");
data.arma = arima(data, order = c(p, 1, q));cat("p =", p, ", q =", q, "AIC =", data.arma$aic, "\n");
}
}

Same with Arima(data,c(5,1,4)) and Arima(diff(data),c(5,0,4)) in the forecast package. I can get the desired consistency with

auto.arima(diff(data),max.p=5,max.q=5,d=0,approximation=FALSE, stepwise=FALSE, ic ="aic", trace=TRUE);
auto.arima(data,max.p=5,max.q=5,d=1,approximation=FALSE, stepwise=FALSE, ic ="aic", trace=TRUE);

but it seems the holder of the minimum AIC estimate for these data has not been considered by the algorithm behind auto.arima; hence the suboptimal choice of ARMA(3,0) instead of ARMA(5,4) acting on the first differences. A related question is how much the two AIC estimates should differ before one considers one model better than the other has little to do wuth programming - the smallest AIC holder should at least be considered/reported, even though 9 coefficients may be a bit too much for a forecast from 100 observations.

My R questions are:

1) Vectorised version of the double loop so it is faster?

2) Why does arima(5,1,4) acting on the data differ from arma(5,4) acting on the first differences of the data? Which one is to be reported?

3) How do I sort the AICs output so that the smaller come first?

Thanks.

+5  A: 

There are a lot of questions and issues raised here. I'll try to respond to each of them.

Arima() is just a wrapper for arima(), so it will give the same model.

arima() handles a model with differencing by using a diffuse prior. That is not the same as just differencing the data before fitting the model. Consequently, you will get slightly different results from arima(x,order=c(p,1,q)) and arima(diff(x),order=c(p,0,q)).

auto.arima() handles differencing directly and does not use a diffuse prior when fitting. So you will get the same results from auto.arima(x,d=1,...) and auto.arima(diff(x),d=0,...)

auto.arima() has an argument max.order which specifies the maximum of p+q. By default, max.order=5, so your arima(5,1,4) would not be considered. Increase max.order if you want to consider such large models (although I wouldn't recommend it).

You can't vectorize a loop involving nonlinear optimization at each iteration.

If you want to sort your output, you'll need to save it to a data.frame and then sort on the relevant column. The code currently just spits out the results as it goes and nothing is saved except for the most recent model fitted.

Rob Hyndman
Thanks, Rob. With max.order=9 added, the AIC for ARIMA(5,1,4)/ARMA(5,4) is 1e+20, whatever it means, so it still selects ARIMA(3,1,0)/ARMA(3,0) as the best.
knot
auto.arima() returns an AIC of 1e20 (i.e., 10^20) when there are problems with the fit. It may be a convergence problem, or the parameters may be near the boundaries of stationarity and invertibility. These signal that the model is likely to have problems and is better not to be used.
Rob Hyndman
As always, thanks for your contributions to stackoverflow, Rob!
griffin
Ok. The AICs differ by almost 1. The plots for ARMA(3,0) and ARMA(5,4) look alike. Anyone with brain would choose the former, and anything based solely on the AIC would suggest the latter. Perhaps a similar or even more interesting example prompted Leo Breiman to say that "automatic methods of model selection are to be shunned or, if use is absolutely unavoidable, are to be examined carefully...". The AIC can mislead. Arima(diff(data),c(5,0,4))plot(arima.sim(n = 630, list(ar = c( 0.3999, -0.4881, 0.0388, -0.2539, 0.5874 ), ma = c(0.7173, 0.7831, 0.7173, 0.9999)),sd = sqrt(7.436)))
knot
@Knot; only if you use AIC in the sense of "the lower the better". Models within 2 AIC units difference are essentially equivalent in their fits relative to complexity. I would read nothing special into models that differed by "almost 1" in AIC. If I wanted "one" model then I'd go with the simpler model. If I was into model averaging, I might keep both models and consider them as two different models that have similar levels of parsimony. All models are wrong after all...
Gavin Simpson