views:

928

answers:

5

I searched the site but did not find exactly what I was looking for... I wanted to generate a discrete random number from normal distribution.

For example, if I have a range from a minimum of 4 and a maximum of 10 and an average of 7. What code or function call ( Objective C preferred ) would I need to return a number in that range. Naturally, due to normal distribution more numbers returned would center round the average of 7.

As a second example, can the bell curve/distribution be skewed toward one end of the other? Lets say I need to generate a random number with a range of minimum of 4 and maximum of 10, and I want the majority of the numbers returned to center around the number 8 with a natural fall of based on a skewed bell curve.

Any help is greatly appreciated....

Anthony

+1  A: 

The normal distribution is not described by its endpoints. Normally it's described by it's mean (which you have given to be 7) and its standard deviation. An important feature of this is that it is possible to get a value far outside the expected range from this distribution, although that will be vanishingly rare, the further you get from the mean.

The usual means for getting a value from a distribution is to generate a random value from a uniform distribution, which is quite easily done with, for example, rand(), and then use that as an argument to a cumulative distribution function, which maps probabilities to upper bounds. For the standard distribution, this function is

F(x) = 0.5 - 0.5*erf( (x-μ)/(σ * sqrt(2.0)))

where erf() is the error function which may be described by a taylor series:

erf(z) = 2.0/sqrt(2.0) * Σn=0 ((-1)nz2n + 1)/(n!(2n + 1))

I'll leave it as an excercise to translate this into C.

If you prefer not to engage in the exercise, you might consider using the Gnu Scientific Library, which among many other features, has a technique to generate random numbers in one of many common distributions, of which the Gaussian Distribution (hint) is one.

Obviously, all of these functions return floating point values. You will have to use some rounding strategy to convert to a discrete value. A useful (but naive) approach is to simply downcast to integer.

TokenMacGuy
You actually don't need to write an implementation of erf(x) in C as it is included in C99 in math.h. Also using a Taylor series for calculation is a nightmare as it converges only *very* slowly. Using a continued fraction approach here is the usual solution (cf. Numerical Recipes in C or the various mathematical libraries like Cephes or fdlibm).
Joey
eh-heh. Got all of this info from google. mostly wikipedia. Also I mentioned that C implementation was to be left as an exercise!
TokenMacGuy
Well, it'd be a pointless exercise as the resulting code won't have much use. There's a reason no-one uses a Taylor series to compute erf(x) :-) (both in speed and accuracy)
Joey
+1  A: 

Many frameworks and libraries have this built-in.

Also, just like TokenMacGuy said a normal distribution isn't characterized by the interval it's defined on, but rather by two parameters: Mean μ and standard deviation σ. With both these parameters you can confine a certain quantile of the distribution to a certain interval, so that 95 % of all points fall in that interval. But resticting it completely to any interval other than (−∞, ∞) is impossible.

There are several methods to generate normal-distributed values from uniform random values (which is what most random or pseudorandom number generators are generating:

  • The Box-Muller transform is probably the easiest although not exactly fast to compute. Depending on the number of numbers you need, it should be sufficient, though and definitely very easy to write.

  • Another option is Marsaglia's Polar method which is usually faster1.

  • A third method is the Ziggurat algorithm which is considerably faster to compute but much more complex to program. In applications that really use a lot of random numbers it may be the best choice, though.

As a general advice, though: Don't write it yourself if you have access to a library that generates normal-distributed random numbers for you already.


For skewing your distribution I'd just use a regular normal distribution, choosing μ and σ appropriately for one side of your curve and then determine on which side of your wanted mean a point fell, stretching it appropriately to fit your desired distribution.


For generating only integers I'd suggest you just round towards the nearest integer when the random number happens to fall within your desired interval and reject it if it doesn't (drawing a new random number then). This way you won't artificially skew the distribution (such as you would if you were clamping the values at 4 or 10, respectively).


1 In testing with deliberately bad random number generators (yes, worse than RANDU) I've noticed that the polar method results in an endless loop, rejecting every sample. Won't happen with random numbers that fulfill the usual statistic expectations to them, though.

Joey
+4  A: 

What do you need this for? Can you do it the craps player's way?

Generate two random integers in the range of 2 to 5 (inclusive, of course) and add them together. Or flip a coin (0,1) six times and add 4 to the result.

Summing multiple dice produces a normal distribution (a "bell curve"), while eliminating high or low throws can be used to skew the distribution in various ways.

The key is you are going for discrete numbers (and I hope you mean integers by that). Multiple dice throws famously generate a normal distribution. In fact, I think that's how we were first introduced to the Gaussian curve in school.


Of course the more throws, the more closely you approximate the bell curve. Rolling a single die gives a flat line. Rolling two dice just creates a ramp up and down that isn't terribly close to a bell. Six coin flips gets you closer.

So consider this...

If I understand your question correctly, you only have seven possible outcomes--the integers (4,5,6,7,8,9,10). You can set up an array of seven probabilities to approximate any distribution you like.

Nosredna
+1  A: 

Yes, there are sophisticated mathematical solutions, but for "simple but practical" I'd go with Nosredna's comment. For a simple Java solution:

Random random=new Random();
public int bell7()
{
  int n=4;
  for (int x=0;x<6;++x)
    n+=random.nextInt(2);
  return n;
}

If you're not a Java person, Random.nextInt(n) returns a random integer between 0 and n-1. I think the rest should be similar to what you'd see in any programming language.

If the range was large, then instead of nextInt(2)'s I'd use a bigger number in there so there would be fewer iterations through the loop, depending on frequency of call and performance requirements.

Jay
+1  A: 

Dan Dyer and Jay are exactly right. What you really want is a binomial distribution, not a normal distribution. The shape of a binomial distribution looks a lot like a normal distribution, but it is discrete and bounded whereas a normal distribution is continuous and unbounded.

Jay's code generates a binomial distribution with 6 trials and a 50% probability of success on each trial. If you want to "skew" your distribution, simply change the line that decides whether to add 1 to n so that the probability is something other than 50%.

Yakbutter