views:

224

answers:

4

I wrote a program in Java that rolls a die and records the total number of times each value 1-6 is rolled. I rolled 6 million times. Here's the distribution:

#of 0's: 0
#of 1's: 1000068
#of 2's: 999375
#of 3's: 999525
#of 4's: 1001486
#of 5's: 1000059
#of 6's: 999487

(0 wasn't an option.)

Is this distribution consistent with random dice rolls? What objective statistical tests might confirm that the dice rolls are indeed random enough?

EDIT: Questions have been raised about the application: it is a game that I want to be as fair as can reasonably be achieved.

+4  A: 

If your random number generator passes the Diehard tests, that's the best you can do.

Even a physical die won't be perfect with 1/6 per face.

Increase the trials by an order of magnitude, then do it again. If you get 1/6 for each trial you'll be fine.

duffymo
_if you get approximately 1/6 for each face_ ~ statistically you should never get exactly 1/6 for each face, that's not random.
drachenstern
Exactly 1/6 for each face is just as likely as any other possible result. I certainly wouldn't say someone should "never" get it as a result.
jemfinch
I'm already running 6 million trials. Isn't that increased enough? Also, what is the Diehard test?
David
http://en.wikipedia.org/wiki/Diehard_tests See here for code: http://www.stat.fsu.edu/pub/diehard/
Dusty
@jemfinch, true enough. I just tend to view an exactly even split as too odd for what's expected, but you're correct. If it reads as 100,100,100,100,100,100 for 600 rolls I would think something was off in the logic. But seeing 99,100,101,100,99,101 I wouldn't have any questions. Maybe it's just me. ~ Consider my previous post amended, yeah?
drachenstern
@jemfinch Exactly 1/6 for each face is highly unlikely and would be a strong indication that the generator is not random.
starblue
In addition to the Diehard or similar tests you should also run it in the desired application and check that the result appears random.
starblue
@starblue Exactly 1/6 every single run would be highly unlikely. Exactly 1/6 *once* is just as likely as any other result.
jemfinch
For the millions of rolls we are talking about, getting exactly 1/6 even once is so unlikely that it is practically impossible. The other results are usually grouped together because they are not as distinguished, and hence as a group have a higher probability.
starblue
+1  A: 

This test alone isn't enough to determine randomness. Not that it's completely useless, but a "random" dice roller that outputs 1,2,3,4,5,6 and repeats would be perfectly random according to this test.

Another suggested test: pick a number, x, and each time it is rolled, record the statistics of what number comes next; you should see an even distribution again. Repeat for all six values of x. If it passes this test it is probably random enough to be used as a dice roller.
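The successor test described above could be sketched roughly as follows. This is only an illustration: the class name `SuccessorTest`, the use of `java.util.Random` as the generator under test, the seed, and the 5% critical value of 11.07 (chi-square with 5 degrees of freedom) are assumptions of this sketch, not part of the original program.

```java
import java.util.Random;

final class SuccessorTest {
    /** counts[i][j] = how often face j+1 came directly after face i+1; rolls must be in 1..6. */
    static long[][] successorCounts(int[] rolls) {
        long[][] counts = new long[6][6];
        for (int i = 0; i + 1 < rolls.length; i++) {
            counts[rolls[i] - 1][rolls[i + 1] - 1]++;
        }
        return counts;
    }

    /** Pearson chi-square statistic of one row against a uniform spread over the 6 faces. */
    static double rowChiSquare(long[] row) {
        long total = 0;
        for (long c : row) {
            total += c;
        }
        if (total == 0) {
            return 0.0; // this face never appeared with a successor
        }
        double expected = total / 6.0;
        double stat = 0.0;
        for (long c : row) {
            double diff = c - expected;
            stat += diff * diff / expected;
        }
        return stat;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // stand-in for the generator being tested
        int[] rolls = new int[600_000];
        for (int i = 0; i < rolls.length; i++) {
            rolls[i] = rng.nextInt(6) + 1;
        }
        long[][] counts = successorCounts(rolls);
        for (int face = 0; face < 6; face++) {
            // Values far above roughly 11.07 suggest the successors of this face are not uniform.
            System.out.printf("after %d: chi-square = %.3f%n", face + 1, rowChiSquare(counts[face]));
        }
    }
}
```

A roller that just cycles 1,2,3,4,5,6 puts every successor of a given face into a single cell, so each row's statistic becomes very large even though the overall face counts look perfectly even.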

Graphics Noob
+5  A: 

To test whether this particular distribution is consistent with the expected distribution of numbers rolled with a "fair" die, you need to perform Pearson's chi-square test.

Note that this still will not prove that your algorithm is "fair", only that these particular results look "fair".
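As a rough sketch, the chi-square statistic for the counts posted in the question can be computed like this. The class name `ChiSquareDie` and the choice of a 5% significance level (critical value about 11.07 for 5 degrees of freedom) are assumptions made for illustration.

```java
final class ChiSquareDie {
    /** Pearson chi-square statistic for observed face counts against a fair die. */
    static double chiSquare(long[] observed) {
        long total = 0;
        for (long c : observed) {
            total += c;
        }
        // A fair die expects the same count for every face.
        double expected = (double) total / observed.length;
        double stat = 0.0;
        for (long c : observed) {
            double diff = c - expected;
            stat += diff * diff / expected;
        }
        return stat;
    }

    public static void main(String[] args) {
        // Counts for faces 1..6 as posted in the question (6,000,000 rolls in total).
        long[] counts = {1000068, 999375, 999525, 1001486, 1000059, 999487};
        double stat = chiSquare(counts);
        double critical = 11.07; // assumed: chi-square, 5 degrees of freedom, 5% level
        System.out.printf("chi-square = %.4f%n", stat);
        System.out.println(stat < critical
                ? "consistent with a fair die at the 5% level"
                : "not consistent with a fair die at the 5% level");
    }
}
```

For these counts the statistic comes out at about 3.10, well below 11.07, so this particular sample gives no evidence against fairness.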

To test whether your algorithm is "fair" in general, use the Diehard tests, as others have mentioned.

Franci Penov
How do the Diehard tests guarantee randomness?
David
The Diehard tests don't guarantee randomness. There's nothing that can guarantee randomness. :-) The Diehard tests are a set of automated tests, intended to be run against a particular random generator implementation, that look for statistical evidence that this implementation is _not_ a "fair" implementation. If your generator passes the Diehard tests, that does not mean it is "fair" and the randomness is guaranteed; it just means there's a high chance it might be "fair".
Franci Penov
A: 

The probability that 6'000'000 dice rolls will end up in exactly 1'000'000 outcomes of each is close to 0. As long as the sum of the outcomes is correct, and the variance (error) of the difference from the expected outcome goes towards 0 (relatively) as the number of trials increases, your random function is not wrong.

You can either prove it mathematically or by testing the random function with larger and larger trial sequences to see that it converges.

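Over a repeated number of tests, the count for each outcome should approximate a Gaussian distribution. E.g. with 6'000'000 rolls, each outcome 1-6 should fall within a normal distribution centered around 1'000'000 with a standard deviation of about 913 (the square root of n·(1/6)·(5/6)). It is the variance of the observed frequency (count divided by number of rolls) that is inversely proportional to the number of dice rolls.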
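To make the Gaussian check concrete, here is a small sketch that expresses each count from the question as a number of standard deviations from the expected 1'000'000. It assumes each count is binomial with p = 1/6 under a fair die; the class name `CountZScores` is made up for this example.

```java
final class CountZScores {
    /** Standard deviation of a single face's count under a fair die: sqrt(n * p * (1 - p)). */
    static double countStdDev(long n) {
        double p = 1.0 / 6.0;
        return Math.sqrt(n * p * (1.0 - p));
    }

    /** How many standard deviations a face's count lies from the expected n / 6. */
    static double zScore(long count, long n) {
        return (count - n / 6.0) / countStdDev(n);
    }

    public static void main(String[] args) {
        long n = 6_000_000L;
        // Counts for faces 1..6 as posted in the question.
        long[] counts = {1000068, 999375, 999525, 1001486, 1000059, 999487};
        System.out.printf("expected standard deviation of a count: %.2f%n", countStdDev(n));
        for (int face = 0; face < counts.length; face++) {
            System.out.printf("face %d: z = %+.3f%n", face + 1, zScore(counts[face], n));
        }
    }
}
```

All six counts land within two standard deviations of 1'000'000, which is what an approximately normal spread would usually produce. The six z-scores are not independent (the counts must sum to n), so this is an informal sanity check rather than a formal test.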

The other tests, the Diehard tests, check that the actual sequence of dice rolls is itself random, and that the 6'000'000 rolls are not, for example, 100'000 consecutive 1's, then 100'000 2's and so on, followed finally by some random sequences.

Ernelli
Can you go into more detail about variance, what it means and how it's calculated?
David
Variance is a measure of how spread out a variable's values are around its mean. If you test your dice with only 6 rolls, the observed frequency of each outcome [1-6] will have a large variance, but if you test with 6'000'000 rolls the variance will be much lower. Think of variance as spread. Anyway, the Diehard tests are more relevant for testing randomness.
Ernelli