ansaurus

Question

Generating a probability distribution

Answer 1

+15 A:

Get n random numbers, calculate their sum and normalize the sum to 1 by dividing each number with the sum.

Kobi 2010-01-31 09:01:05

Nice :) didn't think of that... תודה!

Yuval A 2010-01-31 09:04:38

This introduces bias. You cannot sample uniformly from a simplex this way.

dreeves 2010-06-09 18:54:08

@dreeves - can you elaborate?

Yuval A 2010-06-09 19:36:00

Figure 2 of this paper -- http://www.cs.cmu.edu/~nasmith/papers/smith+tromble.tr04.pdf -- gives a visual depiction of the bias in the n=3 case of the normalization method you propose. I believe that kohomologie's answer is correct (though I didn't check the code they included).

dreeves 2010-06-09 23:30:35

Answer 2

+4 A:

The task you are trying to accomplish is tantamount to drawing a random point from the N-dimensional unit simplex.

http://en.wikipedia.org/wiki/Simplex#Random_sampling might help you.

A naive solution might go as following:

public static double[] getArray(int n)
    {
        double a[] = new double[n];
        double s = 0.0d;
        Random random = new Random();
        for (int i = 0; i < n; i++)
        {
           a [i] = 1.0d - random.nextDouble();
           a [i] = -1 * Math.log(a[i]);
           s += a[i];
        }
        for (int i = 0; i < n; i++)
        {
           a [i] /= s;
        }
        return a;
    }

To draw a point uniformly from the N-dimensional unit simplex, we must take a vector of exponentially distributed random variables, then normalize it by the sum of those variables. To get an exponentially distibuted value, we take a log of uniformly distributed value.

kohomologie 2010-01-31 09:14:29

In most languages, you should create `Random` only once, or you will get not random results (and in many cases - the same number over and over). I'm also concerned about the use of `log` - can you please explain why it's there?

Kobi 2010-01-31 09:41:47

+1 for good reference, but I think `nextDouble()` already adjusts for uniform distribution: http://java.sun.com/javase/6/docs/api/java/util/Random.html#nextDouble()

trashgod 2010-01-31 19:29:20

Kobi, thanks for pointing out the `new Random()` thing. As for `log` -- I have edited my post to include more thorough explanation.

kohomologie 2010-01-31 19:50:51

trashgod, nextDouble() returns an uniformly distributed value from [0,1). So when we take n such values, we get a random point in a parallelotope. If we divide this point's coordinates by Manhattan distance between this point and the origin, I'm quite unsure that we'll get an uniformly distributed point in the unit simplex.

kohomologie 2010-01-31 19:51:30

@kohomologie: thank you for clarifying; the algorithm presumes uniform distribution and uses -ln to get the required exponential distribution.

trashgod 2010-02-01 03:56:39

Answer 3

A:

If you want to generate values from a normal distribution efficiently, try the Box Muller Transformation.

James McLeod 2010-01-31 19:23:44

Good tip, but I believe it isn't germane.

dreeves 2010-06-09 18:55:25

ansaurus

tags:

views:

answers:

Generating a probability distribution

related questions