I am running Python 2.6.5 on Mac OS X 10.6.4 (not the system Python; I installed it myself) with SciPy 0.8.0. If I do the following:

>>> from scipy.stats import hypergeom
>>> hypergeom.sf(5,10,2,5)

I get an IndexError. Then I do:

>>> hypergeom.sf(2,10,2,2)
-4.44....

I suspect the negative value is due to limited floating-point precision. Then I run the first call again:

>>> hypergeom.sf(5,10,2,5)
0.0

Now it works! Can someone explain this? Are you seeing this behavior too?

+1  A: 

I don't know Python, but the function is defined like this: hypergeom.sf(x,M,n,N,loc=0)

M is the number of interesting objects, N the total number of objects, and n is how often you "pick one" (Sorry, German statistician).

If you had a bowl with 20 balls, 7 of them yellow (yellow being the interesting color), then N is 20 and M is 7.

Perhaps the function's behavior is undefined for the (nonsensical) case M > N?

Alexx Hardt
The function as defined in Python is well defined for the values of M, n, N used. According to the docstring of scipy.stats.hypergeom, M is the total number of objects, n is the number of type 1 objects, and N is the number of objects drawn without replacement. So the probabilities are hypergeom(x=0,10,2,5)=2/9, hypergeom(x=1,10,2,5)=5/9, hypergeom(x=2,10,2,5)=2/9; the survival function is therefore 1 for x < 0, 7/9 for 0 <= x < 1, 2/9 for 1 <= x < 2, and 0 for 2 <= x. So for the calls in the question, the sf (survival function, i.e. 1 - cdf, where cdf is the cumulative distribution function) of the hypergeometric distribution should be 0.
jimbob
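To double-check the numbers in the comment above without relying on SciPy, here is a minimal sketch that computes the distribution exactly with rational arithmetic. The helper names n_choose_k, pmf, and sf are made up for illustration and are not SciPy functions; the argument order simply mirrors the docstring convention quoted above (M total objects, n of type 1, N drawn).

```python
from fractions import Fraction


def n_choose_k(n, k):
    """Exact binomial coefficient; 0 when k is outside 0..n (illustrative helper)."""
    if k < 0 or k > n:
        return 0
    result = 1
    for i in range(1, k + 1):
        # After step i, result equals C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result


def pmf(k, M, n, N):
    """P(X = k): k type 1 objects among N drawn from M objects, n of them type 1."""
    return Fraction(n_choose_k(n, k) * n_choose_k(M - n, N - k), n_choose_k(M, N))


def sf(x, M, n, N):
    """Survival function P(X > x), summed exactly over the support 0..min(n, N)."""
    terms = (pmf(k, M, n, N) for k in range(0, min(n, N) + 1) if k > x)
    return sum(terms, Fraction(0))


print(pmf(0, 10, 2, 5), pmf(1, 10, 2, 5), pmf(2, 10, 2, 5))  # 2/9 5/9 2/9
print(sf(0, 10, 2, 5), sf(1, 10, 2, 5))                      # 7/9 2/9
print(sf(5, 10, 2, 5), sf(2, 10, 2, 2))                      # 0 0
```

Because everything here is an exact fraction, both calls from the question come out as exactly 0. That fits the suspicion that the tiny negative value is a floating-point rounding artifact rather than a real probability, though this sketch says nothing about where the IndexError comes from.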
+2  A: 
jimbob