views:

134

answers:

4

Is there a nice way to get randomly generated integers with normal distribution?

The first method which come to mi mind:

int rndi = (int)Math.floor(random.nextGaussian()*std);

Is there better way? Thanks

+1  A: 

That depends on what you are trying to do with those random numbers.

The java.util.Random has some flaws. As stated in JavaDoc, the nextGaussian() method uses Box Muller Transform. It depends on Random.nextDouble() which is implemented using Linear Congruential generator. And the implementation is not the best one, as stated in a bugfix proposal:

Sun's method uses a 48 bit seed and (as far as the bottom bit is concerned) only accesses 17 bits of this - producing extremely severe non-randomness.

So if you are interested in high statistical quality you should really avoid Sun's implementation. Take a look at this "Not so random" applet for visual proof of how bad it is.

If statistical quality is a concern to You, the best you can do is use some external PRNG library.

Rekin
The pathological behavior highlighted in the article cited is a limitation of the low order bits produced by linear congruential generators when _m_ is a power of two; it is not specific to Sun's implementation. When used correctly, `java.util.Random` provides acceptable results for many applications. http://en.wikipedia.org/wiki/Linear_congruential_generator
trashgod
Sure, that's correct. For this use it's definitely sufficient. The main question (without latter comment) was vague and I rushed with the answer.
Rekin
+3  A: 

Strictly speaking, you can't have normally distributed integers. Maybe what you want is the output of a normal distribution sorted into buckets. In that case, you probably want to shift and scale your normal distribution according to the size of your array. If you just take samples from a standard normal distribution (mean = 0 and scale = 1), you'll get samples between -2 and 2 around 99% of the time.

Suppose you want random samples from an array of size N. You want the entries in the middle to be chosen more often than the samples at the end, but you want the samples near the ends to come up occasionally, say 1% of the time. Then you may want to compute something like N/2 + N*z/4 where z is your standard normal then cast those numbers to an integer. If you do this, you'll occasionally get an index outside your array. Just test for that and get a new value when that happens.

John D. Cook
+1  A: 

You can precompute a list of "random" integers, then hand tweak that list to get the distribution you want.

Then when you want a "random" number, just pull the next available one from the list...

This way you ensure the distribution and therefore the probability of a particular item being selected. For fun, you can just "mix up" your list whenever you need.

Chris Lively
+1 More here: http://stackoverflow.com/questions/2343296
trashgod
@trashgod: great link.
Chris Lively
+1  A: 

You should update the question to make clear what's exactly your use case.

According to your comment, you shouldn't be using normal distribution at all. Instead try one of many discrete distributions, since you want integers at the end. There are a lot of those, but I'd recommend one - very simple. It uses stochastic vector as the discrete probability distribution.

Here's example implementation:

public class DiscreteRandom {

    private final double[] probDist;

    public DiscreteRandom(double... probs) {
        this.probDist = makeDistribution(probs);
    }

    private double[] makeDistribution(double[] probs) {
        double[] distribution = new double[probs.length];
        double sum = 0;
        for (int i = 0; i < probs.length; i++) {
            sum += probs[i];
            distribution[i] = sum;
        }
        return distribution;
    }

    public int nextInt() {
        double rand = Math.random();
        int i = 0;
        while (rand > probDist[i]) i++;
        return i;
    }

    /**
     * Simple test
     */
    public static void main(String[] args) {
        // We want 0 to come 3 times more often than 1.
        // The implementation requires normalized probability
        // distribution thus testProbs elements sum up to 1.0d.
        double[] testProbs = {0.75d, 0.25d};
        DiscreteRandom randGen = new DiscreteRandom(testProbs);

        // Loop 1000 times, we expect:
        // sum0 ~ 750
        // sum1 ~ 250
        int sum0 = 0, sum1 = 0, rand;
        for (int i = 0; i < 1000; i++) {
            rand = randGen.nextInt();
            if (rand == 0) sum0++;
            else           sum1++;
        }
        System.out.println("sum0 = " + sum0 + "sum1 = " + sum1);
    }
}
Rekin
a few comments: (1) use a TreeMap going from cumulative probability distribution to integers, and in nextInt() use TreeMap.floorEntry. (2) Your approach only handles N integers from 0 to N-1. What if the OP wants 2N+1 integers from -N to +N, or N integers from 1000 to 1000+N-1?
Jason S
(1) - great optimization, I always wondered how to do it properly. Thanks. And about (2) I think some wrapping method could handle the case without too much coding.
Rekin