views:

258

answers:

5

Given a function R which produces true random 32 bit numbers, I would like a function that returns random integers in the range 0 to n, where n is arbitrary (less than 2^32).

The function must produce all values 0 to n with equal probability.

I would like a function that executes in constant time with no if statements or loops, so something like the Java Random.nextInt(n) function is out.

I suspect that a simple modulus will not do the job unless n is a power of 2 -- am I right?


I have accepted Jason's answer, despite it requiring a loop of undetermined duration, since it appears to be the best method to use in practice and essentially answers my question. However I am still interested in any algorithms (even if less efficient) which would be deterministic in nature and be guaranteed to terminate, such as Mark Byers has pointed to.

+3  A: 

What you're asking for is impossible. You can't partition 2**32 numbers into three sets of exactly equal size.

If you want to guarantee an absolutely perfect uniform distribution in 0 <= x < n, where n is not a power of 2 then you have to be prepared to call R potentially an infinite number of times. In reality you will typically need only one or two calls, but the code has to in theory be able call R any number of times otherwise it can't be completely uniform.

Mark Byers
Can you point me to such a solution, which does not discard any information?
invariant
A closer to optimal solution is to bisect the interval [0,n) repeatedly until it lies fully inside [x,x+1) for some 0<=x<n. Then x is the result. At each step, you consider the interval [a,b) and take one new random bit. If the bit is 0, the new interval is [a,(a+b)/2), otherwise [(a+b)/2),b). It can be implemented without using floating point arithmetic. I don't know the name of this algorithm, but I'm sure it has one.
Mark Byers
Thank you, I will investigate this.
invariant
Thinking about it some more I don't think it's possible to design any algorithm that can in a finite number of steps (even a large number) random uniformly select a number in [0, 1, 2] by using a random function R than returns a uniform int from [0, 2**32). The fundamental problem is that you can't make something have probability 1/3 by adding a finite number of events that have probability 1/2**(32*x) for any integer x.
Mark Byers
A: 

I don't understand why modulus wouldn't do what you want? Since R is a function that produces true random 32 bit numbers, that means that each number has the same probability to be produced, right? So, if you use a modulus n:

randomNumber = R() % (n + 1) //EDITED: n+1 to return values from 0-n

then each number from 0 to n has the same probability!

Alex
No, think about a very simple example. Think of a source `S` that produces 2-bit random numbers (0, 1, 2, 3) all with equal likelihood. Now use modulus to produce a uniform distribution on {0, 1, 2}. So we would consider `S % 3`. But `0` and `3` both map to `0` so we actually have that `0` is produced 50% of the time, and `1` and `2` are produced 25% of the time each.
Jason
A modulus will return values from 0 to n-1, not 0 to n...
Chinmay Kanchi
@Jason +1 Excellent comment
Kristopher Ives
+7  A: 

Without discarding some of the values from the source, you can not do this. For example, a set of size 2^32 can not be partitioned into three equally sized sets. Therefore, it is impossible to do this without discarding some of the values and iterating until a non-discarded value is produced.

So, just use this (pseudocode):

rng is random number generator produces uniform integers from [0, max)
compute m = max modulo (n + 1)
do {
    draw a random number r from rng
} while(r >= max - m)
return r modulo (n + 1)

Effectively I am throwing out the top part of the distribution that causes problems. If rng is uniform on [0, max), then this algorithm will be uniform on [0, n]

Jason
What is the criterion for discarding a value?
invariant
@invariant: See the pseudocode that I provided. Think of it like this. You have some source of randomness uniformly producing values in `{0, 1, 2, ... ,max - 1}`. You want to use it to uniformly produce random values in `{0, 1, 2 ..., n}`. The naive approach to take `r % (n + 1)` where `r` is drawn from the random number generator does not work because `max` might not be divisible by `n + 1`. So, we cut off that excess (the excess is the remainder of `max` divided by `n+1`) by looping until `r` is less than `max - m`. Now it is safe to use modulo.
Jason
I understand now, with the pseudocode, thanks. This appears to be exactly the way the Java nextInt() function does it too. I had thought that there may be a determininstic way to do this instead but clearly not!
invariant
Note that this isn't the most efficient way (e.g. for n = max/2+1 it can take far more calls to R on average than an optimal solution which doesn't discard any information), but it works fine for all practical purposes.
Mark Byers
Ouch... if performance is critical, that n = max/2+1 worst case could really suck. Modern processors generally have good branch prediction, but long latency penalties for a mispredicted branch. So a near-even split with (pseudo-)random numbers is the worst case. Is there anything that can be done to fix this case?
comingstorm
A: 

You can generate two 32 bit numbers and put them together to form 64 bit number. Worst case scenario can be than biased by 0.99999999976716936 if you do not discharge numbers (if you need number whit no more than 32 bits) that mean that some number have by this factor lower probability than other.

But if you still want to remove this small bias you will have low ration "out of range" hits and in that case more that 1 discharge.

ralu
A: 

Depending upon your problem/use of the random numbers, maybe you could pre-allocate your random numbers using a slow method and put them into a simple array. Then getNextRnd() can just return the next in the array.

Quick, fixed time call, no branches, just wasting memory (which is usually pretty cheap) and process initialization time.

DaveC
Isn't this just buffering the issue?
Kristopher Ives