I'm looking for an algorithm or example material to study for predicting future events based on known patterns. Perhaps there is a name for this, and I just don't know/remember it. Something this general may not exist, but I'm not a master of math or algorithms, so I'm here asking for direction.

An example, as I understand it, would be something like this:

A static event occurs on January 1st, February 1st, March 3rd, April 4th. A simple solution would be to average the days/hours/minutes/something between each occurrence, add that number to the last known occurrence, and use the result as the prediction.
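The averaging idea described above can be sketched in a few lines of Python. This is only an illustration: the year 2009 and the function name `predict_next` are assumptions, since the question gives only months and days.

```python
from datetime import date, timedelta

def predict_next(occurrences):
    """Predict the next occurrence as the last date plus the mean gap."""
    # Gaps in whole days between each pair of consecutive occurrences.
    gaps = [(b - a).days for a, b in zip(occurrences, occurrences[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return occurrences[-1] + timedelta(days=mean_gap)

# The year is an assumption; the question only lists month and day.
events = [date(2009, 1, 1), date(2009, 2, 1), date(2009, 3, 3), date(2009, 4, 4)]
prediction = predict_next(events)
```

With these dates the gaps are 31, 30 and 32 days, so the mean gap is 31 days and the prediction falls 31 days after April 4th.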

What am I asking for, or what should I study?

There is no particular goal in mind, or any specific variables to account for. This is simply a personal thought, and an opportunity for me to learn something new.

+1  A: 

The only technique I've worked with for trying to do something like that would be training a neural network to predict the next step in the series. That implies interpreting the issue as a problem in pattern classification, which doesn't seem like that great a fit; I have to suspect there are less fuzzy ways of dealing with it.

chaos
+2  A: 

There is no single 'best' canned solution; it depends on what you need. For instance, you might want to average the values as you say, but using weighted averages where the old values do not contribute as much to the result as the new ones. Or you might try some smoothing. Or you might try to see if the distribution of events fits a well-known distribution (like normal, Poisson, uniform).
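One way to make newer values count more than older ones is an exponentially weighted average. The sketch below is a minimal illustration, not the only weighting scheme; the function name, the `alpha` smoothing factor and the sample gaps are all assumptions.

```python
def weighted_mean_gap(gaps, alpha=0.5):
    """Exponentially weighted average: recent gaps count more than old ones.

    alpha is an assumed smoothing factor in (0, 1]; higher values
    forget old gaps faster, and alpha=1 keeps only the latest gap.
    """
    estimate = gaps[0]
    for gap in gaps[1:]:
        estimate = alpha * gap + (1 - alpha) * estimate
    return estimate

# Illustrative gaps in days between consecutive events.
gaps_in_days = [31, 30, 32]
smoothed = weighted_mean_gap(gaps_in_days)
```

The smoothed gap would then be added to the last known occurrence, just as with the plain average.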

florin
+5  A: 

I think some topics that might be worth looking into include numerical analysis, specifically interpolation, extrapolation, and regression.
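As a rough illustration of regression followed by extrapolation, the sketch below fits a straight line to (event number, day of year) pairs and extends it by one step. The helper `fit_line` and the sample data are assumptions made for the example, and a straight line is only one possible model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Day of year for each occurrence (illustrative, non-leap year assumed).
event_index = [0, 1, 2, 3]
day_of_year = [1, 32, 62, 94]
slope, intercept = fit_line(event_index, day_of_year)
next_day = slope * 4 + intercept  # extrapolate to the fifth occurrence
```

Here the slope plays the role of the typical gap between events, and evaluating the line one step past the data is the extrapolation.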

Lance Harper
I understand that there is never a single best answer, especially given such a vague or ambiguous question, though in this particular case I think that extrapolation is what I was looking for. Thanks!
anonymous coward
I think you mean interpolation, not interpretation.
Pete Kirkham
You're right. Fixed.
Lance Harper
A: 

You should google Genetic Programming Algorithms

They (sort of like the neural networks mentioned by chaos) will enable you to generate solutions programmatically, then have the program modify itself based on a criterion, and create new solutions which are hopefully closer to accurate.

Neural Networks would have to be trained by you, but with genetic programming, the program will do all the work.

Although it is a hell of a lot of work to get them running in the first place!

+2  A: 

This could be overkill, but Markov chains can lead to some pretty cool pattern recognition stuff. It's better suited to, well, chains of events: the idea is, based on the last N steps in a chain of events, what will happen next?

This is well suited to text: process a large sample of Shakespeare, and you can generate paragraphs full of Shakespeare-like nonsense! Unfortunately, it takes a good deal more data to figure out sparsely-populated events. (Detecting patterns with a period of a month or more would require you to track a chain of at least a full month of data.)

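In Python, here's a rough sketch of a Markov chain builder/prediction script:

import random
from collections import defaultdict

n = 2  # chain length: how many past events to condition on

def build_map(event_chain):
    """Map each run of n events to the events seen right after it."""
    followers = defaultdict(list)
    for i in range(len(event_chain) - n):
        key = tuple(event_chain[i:i + n])  # tuples are hashable, lists are not
        followers[key].append(event_chain[i + n])
    return followers

def predict_next_event(whats_happened_so_far, followers):
    """Pick a next event at random, weighted by how often it followed."""
    key = tuple(whats_happened_so_far[-n:])
    return random.choice(followers[key])  # raises IndexError for an unseen key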
ojrac
A: 

If you merely want to find the probability of an event occurring after n days given prior data of its frequency, you'll want to fit to an appropriate probability distribution, which generally requires knowing something about the source of the event (maybe it should be Poisson distributed, maybe Gaussian). If you want to find the probability of an event happening given that prior events happened, you'll want to look at Bayesian statistics and how to build a Markov chain from that.
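For the first case, here is a minimal sketch that assumes the events follow a Poisson process, meaning they arrive independently at a constant average rate. That assumption, the function name and the sample rate are all illustrative; strictly periodic events would not fit this model.

```python
import math

def prob_event_within(days, rate_per_day):
    """P(at least one event within `days` days) under a Poisson process.

    Assumes events arrive independently at a constant average rate,
    which has to be checked against the real data.
    """
    return 1.0 - math.exp(-rate_per_day * days)

# Hypothetical rate estimated from 3 events observed over 93 days.
rate = 3 / 93
p_month = prob_event_within(31, rate)
```

The rate is estimated from the observed frequency, and the formula is one minus the probability of seeing zero events in the window.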

Autoplectic
+2  A: 

If you have a model in mind (such as events occurring at regular intervals), then applying a Kalman filter to the parameters of that model is a common technique.
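As a very reduced illustration, the sketch below runs a scalar Kalman filter over a single model parameter, the interval between events. The starting belief, the noise variances and the function name are all assumptions picked for the example; a real model would likely track more than one parameter.

```python
def kalman_update(estimate, variance, measurement, meas_var, process_var=0.0):
    """One step of a scalar Kalman filter for a slowly varying quantity."""
    variance += process_var                      # predict: uncertainty grows
    gain = variance / (variance + meas_var)      # how much to trust the new data
    estimate += gain * (measurement - estimate)  # correct toward the measurement
    variance *= (1.0 - gain)                     # uncertainty shrinks after update
    return estimate, variance

# Hypothetical starting belief and noise levels, chosen only for illustration.
interval, var = 30.0, 25.0
for observed_gap in [31, 30, 32]:
    interval, var = kalman_update(interval, var, observed_gap,
                                  meas_var=4.0, process_var=0.1)
```

After each observed gap, `interval` holds the current best estimate of the period and `var` says how uncertain that estimate still is.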

Pete Kirkham
+1, I'll also check into that.
anonymous coward