ansaurus

Question

How to generate normally distributed random from an integer range?

Answer 1

+2 A:

If the Box-Muller method returns a "standard" normal distribution, it will have mean 0 and standard deviation 1. To transform a standard normal distribution, you multiply your random number by X to get standard deviation X, and you add Y to obtain mean Y, if memory serves me correctly.

See the Wikipedia article's section on normalizing standard normal variables (property 1) for a more formal proof.

In response to your comment, the rule of thumb is that 99.7% of a normal distribution will be within +/- 3 times the standard deviation. If you need a normal distribution from 0 to 100 for instance, than your mean will be halfway, and your SD will be (100/2)/3 = 16.667. So whatever values you get out of your Box-Muller algorithm, multiply by 16.667 to "stretch" the distribution out, then add 50 to "center" it.

John, in response to your newest comment, I'm really not sure what is the point of the Next function. It always uses a standard deviation of 1 and a mean of halfway between your min and max.

If you want a mean of Y, with ~99.7% of the numbers in the range -X to +X, then you just call BoxMuller(Y, X/3).

Mark Rushakoff 2009-08-20 00:48:50

Yes, this is what the BoxMuller(double mean, double standard_deviation) achieves above. The problem, however, is that the Next(int min, int max) returns values very close to the half way of the range. This is because I don't understand how to "scale" the deviation correctly.

2009-08-20 00:56:00

So for 3.5 standard deviations, it would be "return (int)BoxMuller(min + (max - min) / 2.0, (max - min) / 2.0 / 3.5);"?

2009-08-20 16:37:20

Can I just comment that +/- 3 times standard deviation gives you not 97% but 99.7%.+/-sigma: ~68%+/-2sigma: ~95%+/-3sigma: ~99.7%http://en.wikipedia.org/wiki/68-95-99.7_rule

DmitryK 2009-08-20 23:55:13

Thanks for pointing that out, Dmitry -- my statistics were a bit rusty, apparently. I updated the answer.

Mark Rushakoff 2009-08-21 01:18:04

Answer 2

A:

Well, the -2*sigma..+2*sigma will give you 95% of the bell curve. (check the "Standard deviation and confidence intervals" section in the already mentioned wiki article).

So modify this piece:

return (int)BoxMuller(min + (max - min) / 2.0, 1.0);

and change 1.0 (standard deviation) to 2.0 (or even more if you want more than 95% coverage)

return (int)BoxMuller(min + (max - min) / 2.0, 2.0);

DmitryK 2009-08-20 01:31:38

Now I understand what I want to formulate thanks to you.. I want the -2*sigma .. +2*sigma to be at the beginning and the end of the range (min, max) respectively.

2009-08-20 02:10:27

Ah, I understand now. So you want to step out from your mean value left and right and "hit" either min or max with a 95% confidence.In this case - you keep your mean as you have it (min + (max-min)/2), but you need to calculate your sigma (standard deviation). Stepping out by 2*sigma gives us that 95% interval. So the length of this interval is 4*sigma. But we can also calculate it as (max-min). Which gives us sigma=(max-min)/4.Can you please try that?

DmitryK 2009-08-20 04:55:39

"hit" within the min..max range with the 95% confidence - just to be precise in my wording.

DmitryK 2009-08-20 04:56:34

This works also, too bad I can only make one answer as correct. Stackoverflow should add "assisted" or "shared" answers.

2009-08-20 16:39:41

Answer 3

A:

You need to work out how much of your normal distribution must be within your 'range'. You can't say all of the distribution should be within the range, because as you note the distribution runs to infinity.

Instead you need to specify that, e.g. 99% of the distribution lives within your range, or 99.99% lives within your range. This will in turn allow you to calculate your required standard deviation. This is quite important as it will strongly influence the chance of distribution of values near your range boundaries.

Obviously your 'mean' value will be the midpoint of your range.

Kirk Broadhurst 2009-08-20 02:09:46

ansaurus

tags:

views:

answers:

How to generate normally distributed random from an integer range?

related questions