ansaurus

Question

Determining the chances of an event occurring when it hasn't occurred yet

Answer 1

+2 A:

Assuming you keep data on past impressions and clicks, it's easy: let's say that you have an impression, and a time d' has passed since that impression. You can divide your data into three groups:

(1) Impressions which received a click in less than d'
(2) Impressions which received a click after more than d'
(3) Impressions which never received a click

Clearly the current impression is not in group (1), so eliminate that. You want the probability it is in group (2), which is then

P = N2 / (N2 + N3)

where N2 is the number of impressions in group 2, and similarly for N3.

As for the actual implementation, my first thought would be to keep an ordered list of the times d for past impressions which did receive clicks, along with a count of the number of impressions which never received a click (that count is N3), and just do a binary search for d' in that list. The position you find will give you N1, the number of past clicks that arrived in less than d', and then N2 is the length of the list minus N1.
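A minimal sketch of the sorted-list approach described above, assuming the delays are stored as plain numbers in one time unit. The function name `click_probability` and its parameter names are made up for illustration; they are not from any library.

```python
from bisect import bisect_right


def click_probability(click_delays, never_clicked, elapsed):
    """Estimate P = N2 / (N2 + N3) for an impression with no click after `elapsed`.

    click_delays  -- sorted list of delays d for past impressions that were clicked
    never_clicked -- N3, the number of past impressions that never got a click
    elapsed       -- d', the time since the current impression
    """
    # N1: past clicks that had already arrived by d'.
    # Assumption: a delay exactly equal to d' is counted in group 1.
    n1 = bisect_right(click_delays, elapsed)
    # N2: past clicks that arrived later than d'.
    n2 = len(click_delays) - n1
    denominator = n2 + never_clicked
    # With no comparable history there is nothing to estimate from; return 0.0.
    return n2 / denominator if denominator else 0.0


delays = [1, 2, 5, 8, 13]  # illustrative numbers only
print(click_probability(delays, never_clicked=15, elapsed=4))
```

New delays can be kept in order with `bisect.insort`, so each lookup stays a binary search.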

If you don't need perfect granularity, you can store the past times as a histogram instead, i.e. a list that contains, in each element list[n], the number of impressions that received a click after at least n but less than n+1 minutes (or seconds, or whatever time interval you like). In that case you'd probably want to keep the total number of clicks as a separate variable so you can easily compute N2.
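The histogram variant could look something like the sketch below. The class `ClickHistogram`, its method names, and the default bucket width of 60 seconds are all assumptions chosen for illustration.

```python
class ClickHistogram:
    """Bucketed record of click delays, plus a count of unclicked impressions."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self.counts = []        # counts[n]: clicks with delay in bucket n
        self.total_clicks = 0   # kept separately so N2 is cheap to derive
        self.never_clicked = 0  # N3

    def record_click(self, delay_seconds):
        n = int(delay_seconds // self.bucket_seconds)
        if n >= len(self.counts):
            # Grow the list so that index n exists.
            self.counts.extend([0] * (n + 1 - len(self.counts)))
        self.counts[n] += 1
        self.total_clicks += 1

    def record_no_click(self):
        self.never_clicked += 1

    def click_probability(self, elapsed_seconds):
        n = int(elapsed_seconds // self.bucket_seconds)
        # N1: clicks in buckets that lie entirely before the current one.
        # Clicks in the current bucket are treated as still possible,
        # which is the precision lost to bucketing.
        n1 = sum(self.counts[:n])
        n2 = self.total_clicks - n1
        denominator = n2 + self.never_clicked
        return n2 / denominator if denominator else 0.0


hist = ClickHistogram(bucket_seconds=60)
for delay in (30, 90, 100, 400):  # illustrative numbers only
    hist.record_click(delay)
for _ in range(6):
    hist.record_no_click()
print(hist.click_probability(120))
```

Summing the leading buckets on every query is linear in the number of buckets; if that matters, a running cumulative count per bucket would avoid it.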

(By the way, I just made this up; I don't know if there are standard algorithms for this sort of thing that may be better.)

David Zaslavsky 2010-05-03 18:35:44

Answer 2

+1 A:

See this article:

Estimating the chances of something that hasn't happened yet.

John D. Cook 2010-05-03 19:21:34

Answer 3

A:

I would suggest hypothesizing an arrival process (clicks per minute) and fitting a distribution to it using your existing data. I'll bet the result is negative binomial, which is what you get from a Poisson arrival process with a non-stationary mean when that mean has a gamma distribution. The inverse (minutes per click) gives you the distribution of the interarrival times. I don't know if there's a distribution named for that, but you can create an empirical one.
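One way to build the empirical interarrival distribution mentioned above is to take the gaps between consecutive click timestamps and count how many exceed a given length. This is only a sketch: the function names `interarrival_gaps` and `empirical_survival` are hypothetical, and it assumes the timestamps are numbers in a single time unit.

```python
def interarrival_gaps(click_times):
    """Return the gaps between consecutive click timestamps."""
    ordered = sorted(click_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def empirical_survival(click_times, t):
    """Fraction of observed gaps that were longer than t.

    Returns None when fewer than two clicks were observed,
    since no gap can be measured in that case.
    """
    gaps = interarrival_gaps(click_times)
    if not gaps:
        return None
    return sum(gap > t for gap in gaps) / len(gaps)


times = [0, 1, 3, 7, 8]  # illustrative numbers only
print(interarrival_gaps(times))
print(empirical_survival(times, 1.5))
```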

Hope this helps.

Grembo 2010-05-04 21:50:27
