I have a program that records each event with some probability p. After a run, k events have been recorded. How can I calculate how many events occurred in total, recorded or not, with some confidence, say 95%?

So for example, after getting 13 events recorded I would like to be able to calculate that there were between 13 and 19 events total with 95% confidence.

A: 

I'm pretty sure your process is the same as a binomial process - an event being recorded, which happens with probability p, can be considered a success. I don't think there's a need to elaborate further on the underlying process.

The twist in your problem is that you don't know the value of n, only k and p. Confidence interval calculations typically assume you know n & p and you want a confidence interval around k, the number of successes. See here.
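For reference, the usual direction (n and p known, interval around k) can be sketched in a few lines. This is only an illustration using the standard library; the function names `binom_pmf` and `binom_central_interval` are made up for this example and are not from the linked page.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_central_interval(n, p, level=0.95):
    """Central interval (lo, hi) for k when n and p are known.

    lo is the smallest k whose cumulative probability exceeds alpha/2,
    hi is the smallest k whose cumulative probability reaches 1 - alpha/2,
    so at most alpha/2 of the probability lies strictly outside on each side.
    """
    alpha = 1 - level
    cdf = 0.0
    lo = None
    hi = n
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        if lo is None and cdf > alpha / 2:
            lo = k
        if cdf >= 1 - alpha / 2:
            hi = k
            break
    return lo, hi
```

Calling `binom_central_interval(n, p)` gives the range of recorded counts k you would expect to see for a known n, which is the reverse of what the question asks for.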

Given k and p, you should be able to determine the probability distribution of n, q(n), then create a distribution of k given known p and q(n). This distribution of k will yield a confidence interval, right?
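One possible way to get q(n), as a rough sketch: assume every n >= k is equally likely beforehand (a flat prior, which is my assumption and not something stated in the question). Then q(n) is proportional to the binomial likelihood C(n, k) p^k (1-p)^(n-k), and summing over n >= k gives the normalizing factor 1/p, so q(n) = C(n, k) p^(k+1) (1-p)^(n-k). The function names below are hypothetical, and strictly speaking the result is a Bayesian credible interval for n rather than a classical confidence interval.

```python
from math import comb


def posterior_n(n, k, p):
    """q(n): probability of n total events given k recorded, flat prior on n >= k."""
    if n < k:
        return 0.0
    return comb(n, k) * p ** (k + 1) * (1 - p) ** (n - k)


def credible_interval_n(k, p, level=0.95):
    """Equal-tailed interval (lo, hi) for n under q(n). Requires 0 < p <= 1."""
    alpha = 1 - level
    cdf = 0.0
    lo = None
    n = k
    while True:
        cdf += posterior_n(n, k, p)
        if lo is None and cdf > alpha / 2:
            lo = n  # first n where the lower tail exceeds alpha/2
        if cdf >= 1 - alpha / 2:
            return lo, n  # first n where the upper tail drops to alpha/2
        n += 1
```

Calling `credible_interval_n(13, p)` with your own value of p gives a range for the total number of events. Under this flat-prior assumption the mean of q(n) is (k+1)/p - 1, which is close to k/p when k is large.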

Grembo
That's right, but how can I determine the probability distribution of n? I know the answer is around n = k/p but not how it's distributed around that.
Statec
A: 

If p is between 0 and 1:

(1/p) * k = typical number of actual events

If your random() is PERFECT, it will ALWAYS be true. However, this is not usually the case.

For a LARGE k (the larger k is, the more accurate the result in percentage terms) it will be CLOSE to the actual number, though it is doubtful that it will hit it exactly.

TaslemGuy
Yes, you are right. I know k/p is the expected number of total events. What I am asking is how to compute an interval around k/p in which the actual number of events is almost sure (95%) to lie.
Statec
A: 

The problem with your statement is that you are saying there is a known probability of the event. If that is known and you know how many events you saw, there is no error in how many events there were. Do you know how many recordings there were?

I think you need to reframe the way you are asking the question or try to estimate something different.

Or are you saying your recording only happens 60% of the time when a true event happens? What is it you are measuring, and what constitutes an event? An analogy would be OK - but the way it is formulated now there is no way to construct a confidence interval on the true number of events.

TheSteve0
A: 

Here is the answer that Andrew Walker gave on the stats site. I am going to accept this as the answer to this question. Thanks to everyone.

Statec