I've checked the examples on the Boost website, but they are not what I'm looking for.

To put it simply, I want to see whether a die favors any number. With 600 rolls, each number (1 through 6) should come up about 100 times on average.

And I want to use the chi square distribution to check if the die is fair.

Help! How would I do this, please?

+6  A: 

Suppose e[i] and o[i] are arrays holding the expected and observed count of rolls for each of the 6 possibilities. In your case, e[i] is 100 for each bin, and o[i] is the number of times i was rolled in your 600 trials.

You then calculate the chi-squared statistic by summing (e[i]-o[i])^2/e[i] over the 6 bins. Let's say your o[i] array came out with 105, 95, 102, 98, 98, and 102 counts after doing your 600 trials.

chi2 = 5^2/100 + 5^2/100 + 2^2/100 + 2^2/100 + 2^2/100 + 2^2/100 = 0.660

You have five degrees of freedom (number of bins minus 1). So you're going to have a declaration like

boost::math::chi_squared mydist(5);

to create the Boost object representing your chi-square distribution.

At this point you would use the cdf accessor function (cumulative distribution function) from the Boost library to look up the cumulative probability corresponding to a chi-squared score of .660 with five degrees of freedom. The p-value is 1 minus that cumulative probability.

double p = boost::math::cdf(mydist, 0.660);

You should get something close to 0.015. That is the probability of a score below 0.660, so the p-value is (1 - .015) = 0.985: a 98.5% probability of observing a chi-squared score at least as extreme as 0.660, if one assumes the null hypothesis (that the die is fair) holds. So for this set of data, the null hypothesis cannot be rejected at any reasonable confidence level. (Disclaimer: untested code! But if I understand the Boost documentation correctly, this is how it should work.)

Jim Lewis
From Wikipedia: "The p-value is not the probability that the null hypothesis is true. ... the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true."
telliott99
@telliott99: You are right...I've reworded that section a bit to clarify the interpretation of the hypothetical test results.
Jim Lewis