Hello, the expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n. Suppose I measure P using an unbiased method sufficiently many times. What type of distribution does the measured P follow? It is clear that P is not normally distributed, since it cannot be negative. May I therefore correctly assume that P is gamma distributed? And if so, what are the parameters of this distribution? A histogram of the probabilities of selecting an element from a 100-element set 1000 times is shown here.

Is there any way to convert this to a standard distribution?

Now suppose that the observed probability of selecting the given element was P* (P* != P). How can I estimate whether the bias is statistically significant?

EDIT: This is not homework. I'm doing a hobby project and I need this piece of statistics for it. I did my last homework ~10 years ago :-)

A: 

Is that a "discrete uniform distribution?"

http://en.wikipedia.org/wiki/Uniform_distribution_(discrete)

rice
+3  A: 

This is clearly a binomial distribution: the number of times a given element is selected is binomial with p = 1/(number of elements) and n = (number of trials).

To test whether the observed result differs significantly from the expected result, you can do the binomial test.

The dice examples on the two Wikipedia pages should give you some good guidance on how to formulate your problem. In your 100-element, 1000 trial example, that would be like rolling a 100-sided die 1000 times.
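As a rough illustration of the binomial test on the 100-element, 1000-trial example, here is a minimal sketch using only the Python standard library. The observed count of 18 is made up for illustration, and the function names are hypothetical. The two-sided p-value is taken as the total probability of all outcomes no more likely than the observed one, which is one common convention and not the only one.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p); lgamma avoids overflow for large n."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)

def binom_test_two_sided(k_obs, n, p):
    """Exact two-sided p-value: sum the probabilities of every outcome
    that is no more likely than the observed count (requires 0 < p < 1)."""
    log_obs = binom_log_pmf(k_obs, n, p)
    tol = 1e-9  # small slack so floating-point ties are counted
    total = 0.0
    for k in range(n + 1):
        lp = binom_log_pmf(k, n, p)
        if lp <= log_obs + tol:
            total += math.exp(lp)
    return min(1.0, total)

# Made-up data: the element was picked 18 times in 1000 draws from 100 elements.
p_value = binom_test_two_sided(18, 1000, 1 / 100)
print(f"two-sided p-value = {p_value:.4f}")
```

A small p-value (below a threshold chosen in advance, such as 0.05) would suggest the selection is biased for that element.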

Randy
+2  A: 

With repetitions, your distribution will be binomial. So let X be the number of times you select some fixed object out of N, with M total selections:

P{ X = x } = ( M choose x ) * (1/N)^x * ((N-1)/N)^(M-x)

You may find this difficult to compute for large M. It turns out that for a sufficiently large number of selections M, the distribution of X is well approximated by a normal distribution (Central Limit Theorem).

In that case P{X = x} is approximately given by a normal distribution. The mean will be M/N and the variance will be M * (1/N) * (N-1)/N.
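To make the mean and variance concrete, here is a small sketch of a z-test based on the normal approximation. The observed count of 18 is invented for illustration and the function name is hypothetical. No continuity correction is applied, and with an expected count of only M/N = 10 the approximation is fairly rough.

```python
import math

def normal_approx_z_test(x_obs, M, N):
    """Compare an observed count with the normal approximation to the binomial."""
    mean = M / N                       # expected number of selections
    var = M * (1 / N) * ((N - 1) / N)  # binomial variance M * p * (1 - p)
    z = (x_obs - mean) / math.sqrt(var)
    # Two-sided tail probability of a standard normal beyond |z|.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return mean, var, z, p_two_sided

# Made-up data: 18 hits in M = 1000 selections from N = 100 objects.
mean, var, z, p = normal_approx_z_test(18, M=1000, N=100)
print(f"mean={mean:.2f} var={var:.2f} z={z:.3f} p={p:.4f}")
```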

Ying Xiao
+1  A: 

As others have noted, you want the Binomial distribution. Your question seems to imply an interest in a continuous approximation to it, though. It can actually be approximated by the normal distribution (when the expected count n*p is large), and also by the Poisson distribution with mean n*p (when p is small).
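To get a feel for how close these approximations are, the sketch below compares the exact binomial probabilities with the Poisson and normal approximations for the 100-element, 1000-trial case from the question. The helper names are hypothetical, and the normal column is simply the density evaluated at k, without a continuity correction.

```python
import math

def binom_pmf(k, n, p):
    """Exact binomial probability; fine for moderate n and small k."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_pdf(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

n, p = 1000, 1 / 100     # trials and per-trial probability from the question
mu = n * p               # expected count
var = n * p * (1 - p)    # binomial variance

print(" k  binomial  poisson   normal")
for k in (0, 5, 10, 15, 20):
    print(f"{k:2d}  {binom_pmf(k, n, p):.5f}  {poisson_pmf(k, mu):.5f}  {normal_pdf(k, mu, var):.5f}")
```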

fivebells