I'd like to know what method is best suited for predicting event occurrences. For example, given a set of data from 5 years of malaria infection occurrences and several other factors that affect the occurrences, I'd like to predict malaria infection occurrences for the next five years. What I thought of doing was to derive a kind of occurrence factor using fuzzy logic rules, and then average the occurrences with the occurrence factor to get the first predicted occurrence. I would then average everything again together with the predicted occurrence, and keep iterating for all five years. Before going further, I decided to seek help online.

A: 

I think with your idea as stated, you'll see asymptotic behavior as time goes by. Either your predictions will converge to 0, or they will explode. That said, you'd probably have to give some data and/or describe its properties before anyone can help you. This is basically a simulation, and the factors are everything when it comes to extrapolation.
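To illustrate the point, here is a minimal sketch under one assumed reading of the question: each new prediction is a fixed factor times the mean of everything seen so far, and the prediction is then fed back into the history. The function name, the fixed factor, and the sample counts are all hypothetical; the question does not specify how the fuzzy-logic factor would actually be combined.

```python
from statistics import mean

def iterate_forecast(history, factor, steps):
    """Predict factor * mean(all values so far), then feed the prediction back.

    Assumed interpretation of the question's scheme; `factor` stands in for
    the fuzzy-logic "occurrence factor" and is held constant here.
    """
    values = list(history)
    predictions = []
    for _ in range(steps):
        nxt = factor * mean(values)
        predictions.append(nxt)
        values.append(nxt)  # the prediction becomes part of the history
    return predictions

# Hypothetical yearly counts, for illustration only.
cases = [100, 120, 110, 130, 90]
shrinking = iterate_forecast(cases, 0.8, 5)  # each step is lower than the last
flat = iterate_forecast(cases, 1.0, 5)       # stays at the historical mean
growing = iterate_forecast(cases, 1.2, 5)    # each step is higher than the last
```

Under this reading, a factor below 1 pulls every later prediction down, a factor above 1 pushes it up, and only a factor of exactly 1 holds steady, which is why the properties of the factors matter so much.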

Stefan Mai
+2  A: 

There are many ways to do forecasting, and each has its own advantages and disadvantages. The science of determining the accuracy of a forecast often consists of trying to minimize error. All forecasting comes down to using the past as a predictor of the future, adjusted by some amount. E.g. tomorrow the temperature will be the same as today, plus or minus some amount. How you decide the +/- is what varies.

Here is a range of techniques you might want to review:

  • Moving Averages (simple, single, double)
  • Exponential Smoothing
  • Decomposition (Trend + Seasonality + Cyclicals + Irregularities)
  • Linear Regression
  • Multiple Regression
  • Box-Jenkins (a.k.a. ARIMA, Auto-Regressive Integrated Moving Average)
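
As a rough illustration of the two simplest entries in the list, the sketch below computes a one-step-ahead forecast with a simple moving average and with simple exponential smoothing. The function names, the window size, and the smoothing weight `alpha` are illustrative assumptions, not recommendations for the malaria data.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window

def exponential_smoothing_forecast(series, alpha):
    """Forecast the next value with simple exponential smoothing.

    `alpha` in (0, 1] weights the newest observation; the rest of the weight
    stays on the running level, so older points fade out geometrically.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

# Hypothetical yearly counts, for illustration only.
cases = [100, 120, 110, 130, 90]
ma = moving_average_forecast(cases, window=3)
es = exponential_smoothing_forecast(cases, alpha=0.5)
```

Both methods ignore the other explanatory factors mentioned in the question; the regression-based entries in the list are the ones that can use them.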

Sorry for the vague answer, but forecasting is complex stuff.

What you describe about feeding your predictions back into the model to produce future predictions is standard stuff. I don't know if "fuzzy logic" gets you anything in particular. As any forecasting instructor will tell you, sometimes you just squint and look at the data. Context is everything.

caskey
+1  A: 

I would use a logit or probit model to predict occurrence given a set of exogenous circumstances. I'm not sure why you want to iterate. That would basically be equivalent to including a lag of the dependent variable in the regression formula. You could do it, and as long as the coefficient on the lag was less than 1 in absolute value, you wouldn't have the explosion problem.
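
The point about the lag coefficient can be seen in a small sketch of a first-order recursion, y[t] = intercept + phi * y[t-1], iterated forward. The function name and the numbers are hypothetical, and a real model would estimate `intercept` and `phi` from the data rather than fix them by hand.

```python
def ar1_forecast(last_value, intercept, phi, steps):
    """Iterate y[t] = intercept + phi * y[t-1] forward for `steps` periods."""
    forecasts = []
    y = last_value
    for _ in range(steps):
        y = intercept + phi * y
        forecasts.append(y)
    return forecasts

# With |phi| < 1 the forecasts settle toward intercept / (1 - phi), here 100.
stable = ar1_forecast(last_value=200, intercept=50, phi=0.5, steps=5)

# With |phi| > 1 each step moves further away from that level.
explosive = ar1_forecast(last_value=200, intercept=50, phi=1.5, steps=5)
```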

If you want to introduce an element of endogeneity to the independent variables, you could use a VAR (vector autoregression).

af