views:

1266

answers:

9

For a game I'm trying to determine the frequency that a certain # will show up at a given # of dice being rolled. I know... that question seems odd. Let me try to explain it with real numbers.

So, for 1 die, the frequency for each number will be identical. 1-6 will show up equal number of times.

Now for 2 dice, things get different. I imagine 5,6,7 are going to be the most frequently rolled, while numbers at both ends of the spectrum will show up less or not at all (in the case of 1). I'd like to know how to calculate this list and show them in the proper order, from most frequent to less frequent.

Any thoughts?

+10  A: 

There are 6*6 = 36 combinations for two dice.

2 = 1+1 can only appear once, so its frequency is 1/36. 3 = 1+2 or 2+1, so its frequency is 2/36 = 1/18. 4 = 1+3, 2+2, or 3+1, so its frequency is 3/36 = 1/12.

You can do the rest out to twelve.

Any backgammon player knows these well.

duffymo
+2  A: 

There's lots of stuff online about dice probability. Here's one link that helped me out with a Project Euler question:

http://gwydir.demon.co.uk/jo/probability/calcdice.htm

Dana
A: 

@duffymo - It would be really nice though to have some sort of an algorithm to come up with it. It seems that the above way is going to require a lot of hand picking and placing of numbers. If my die count is dynamic up to say 10, doing that by hand will be ineffecient and troublesome I think. :)

Chu
My comment provides an algorithm that works.
mquander
I disagree - why write code to calculate such a thing each time when a one-time hand calculation that you load into a static table works just as well? I think a table lookup will be more efficient than anything you could write.
duffymo
However, if you have a lot of dice with a lot of sides, it may be quite slow, since it makes sides^dice function calls; for example, 8 dice with 6 sides each took 5 seconds on my machine. I'll leave it as an exercise to you to find a fast mathematical way to calculate the frequencies.
mquander
I agree that the best way to do it is probably to precompute it and then hardcode it.
mquander
Pre-computing it could become error prone. I could easily miss a combination then get it all wrong. Especially if we are talking a large dynamic number.
Chu
Don't pre-compute it by hand! Just use the math (I think there are a couple of references to it) to make sure that the numbers are right. (or, alternatively, if you don't pre-compute it, just use the math in an O(1) algorithm...rather than the correct but *very* inefficent code in the "answer".)
Beska
+1  A: 

Rough draft of a recursive way to do it:

public static IEnumerable<KeyValuePair<int, int>> GetFrequenciesByOutcome(int nDice, int nSides)
{
    int maxOutcome = (nDice * nSides);
    Dictionary<int, int> outcomeCounts = new Dictionary<int, int>();
    for(int i = 0; i <= maxOutcome; i++)
        outcomeCounts[i] = 0;

    foreach(int possibleOutcome in GetAllOutcomes(0, nDice, nSides))
        outcomeCounts[possibleOutcome] = outcomeCounts[possibleOutcome] + 1;

    return outcomeCounts.Where(kvp => kvp.Value > 0);
}

private static IEnumerable<int> GetAllOutcomes(int currentTotal, int nDice, int nSides)
{
    if (nDice == 0) yield return currentTotal;
    else
    {
        for (int i = 1; i <= nSides; i++)
            foreach(int outcome in GetAllOutcomes(currentTotal + i, nDice - 1, nSides))
                yield return outcome;
    }
}

Unless I'm mistaken, this should spit out KeyValuePairs organized like [key, frequency].

EDIT: FYI, after running this, it shows the frequencies for GetFrequenciesByOutcome(2, 6) to be:

2: 1

3: 2

4: 3

5: 4

6: 5

7: 6

8: 5

9: 4

10: 3

11: 2

12: 1

mquander
I wonder why this was downvoted, as nobody else has provided an algorithm.
mquander
Not true - see reply #1.
duffymo
#1 is a calculation, not an algorithm. It has been suggested that the calculation be done manually and then the table hard-coded (ouch) - I'd much rather have a real algorithm like this than have to manually calculate the numbers and hard code them.
Chu
The algorithm is fairly straightforward like bubblesort is...and with similar efficency problems. It is the *only* algorithm presented so far, but for large numbers it's going to be far more intensive than needed. Better to just code the math behind it. (Some other posts have links to the math).
Beska
That's right, the math is better. The algorithm is just a straightforward simulation of every possible dice roll, which, as you can imagine, could be a lot of dice rolls.
mquander
A: 

@ALL -

If you've played Settlers of Catan, you'll notice that some of the numbers on the board are better to build around because those numbers show up more often.

@JoshFinnie -

If I rolled my dice 10 times in a row, 5,6,7 would be more likely to show up more often than 2 or 12. So saying 5,6,7 are equal to 2 and 12 seems... odd.

Chu
+5  A: 

There is no real "algorithm" or simulation necessary - it's a simple calculation based on a formula derived by De Moivre:

http://www.mathpages.com/home/kmath093.htm

And it's not a "bell curve" or normal distribution.

Cade Roux
Not a bell curve? You sure about that? I mean, it's definitely shaped like a bell, although I'll admit that my statistics are weak enough that I could be convinced it wasn't "normal" distribution, and maybe not the classic "bell curve".
Beska
(BTW, great link...I'll vote it up if you can convince me that it's not really a bell curve.)
Beska
It's a discrete distribution, so it couldn't be a true normal distribution. However, like most distributions, it is asymptotically normal - for large numbers of dice, the normal distribution is a better approximation (and computationally easier).
That's fine...that's what I thought and I hedged that way in my post: that it approximates a normal distribution. That was the problem I had with this answer...it implies they're two different beasts, rather than just a matter of stretching the iterations out to infinity.
Beska
The larger the number of dice, the closer the approximation to a normal distribution. For this reason (the factorials in the combinatorics) for large numbers of dice the normal distribution can be used more quickly.
Cade Roux
A: 

There seems to be some mystery surrounding exactly "why" this is, and although duffymo has explained part of it, I'm looking at another post that says:

There should be no reason why 5, 6 and 7 should be rolled more [than 2] since the first roll of the die is a independent event from the second roll of the die and both of them have equal probablity of 1-6 of being rolled.

There's a certain appeal to this. But it's incorrect...because the first roll affects the chances. The reasoning can probably most easily be done through an example.

Say I'm trying to figure out if the probability of rolling 2 or 7 is more likely on two dice. If I roll the first die and get a 3, what are my chances now of rolling a total of 7? Obviously, 1 in 6. What are my chances of rolling a total of 2? 0 in 6...because there's nothing I can roll on the second die to have my total be 2.

For this reason, 7 is very (the most) likely to be rolled...because no matter what I roll on the first die, I can still reach the correct total by rolling the right number on the second die. 6 and 8 are equally slightly less likely, 5 and 9 more so, and so on, until we reach 2 and 12, equally unlikely at 1 in 36 chance apiece.

If you plot this (sum vs likelyhood) you'll get a nice bell curve (or, more precisely, a blocky aproximation of one because of the discrete nature of your experiment).

Beska
+1  A: 

JavaScript implementation using dynamic function creation:

<script>
var f;
function prob(dice, value)
 {
var f_s = 'f = function(dice, value) {var occur = 0; var a = [];';
for (x = 0; x < dice; x++)
 {
f_s += 'for (a[' + x + '] = 1; a[' + x + '] <= 6; a[' + x + ']++) {';
 }
f_s += 'if (eval(a.join(\'+\')) == value) {occur++;}';
for (x = 0; x < dice; x++)
 {
f_s += '}';
 }
f_s += 'return occur;}';
eval(f_s);
var occ = f(dice, value);
return [occ, occ + '/' + Math.pow(6, dice), occ / Math.pow(6, dice)];
 };

alert(prob(2, 12)); // 2 die, seeking 12
                    // returns array [1, 1/36, 0.027777777777777776]
</script>

EDIT: Rather disappointed nobody pointed this out; had to replace 6 * dice with Math.pow(6, dice). No more mistakes like that...

Hexagon Theory
I'm curious as to why you chose this particular exclamation. : )
Hexagon Theory
Because this is a much more pleasantly entertaining way to write code than what I tend toward :)
mquander
I like to think so, too. Only just recently discovered dynamic function creation/modification in JavaScript; glad to find a use for it.
Hexagon Theory
+1  A: 

Neat factoid...

Did you know that Pascal's triangle is the probability distribution of the sums of N 2-sided dice?

   1 1    - 1 die, 1 chance at 1, 1 chance at 2
  1 2 1   - 2 dice, 1 chance at 2, 2 chances at 3, 1 chance at 4
 1 3 3 1  - 3 dice, 1 chance at 3, 3 chances at 4, 3 chances at 5, 1 chance at 6 
1 4 6 4 1 - etc.
Instantsoup
This is neat! Now I'm thinking about how to generalize this iterative thought process (how you create one row after the next in Pascal's triangle) to M-sided dice.
mquander