views:

495

answers:

2

I'd like to generate uniformly distributed random integers over a given range. The interpreted language I'm using has a builtin fast random number generator that returns a floating point number in the range 0 (inclusive) to 1 (inclusive). Unfortunately this means that I can't use the standard solution seen in another SO question (when the RNG returns numbers between 0 (inclusive) to 1 (exclusive) ) for generating uniformly distributed random integers in a given range:

result=Int((highest - lowest + 1) * RNG() + lowest)

The only sane method I can see at the moment is in the rare case that the random number generator returns 1 to just ask for a new number.

But if anyone knows a better method I'd be glad to hear it.

Rob

NB: Converting an existing random number generator to this language would result in something infeasibly slow so I'm afraid that's not a viable solution.

Edit: To link to the actual SO answer.

A: 

I don't see why the + 1 is needed. If the random number generator delivers a uniform distribution of values in the [0,1] interval then...

result = lowest + (rng() * (highest - lowest))

should give you a unform distribution of values between lowest

rng() == 0, result = lowest + 0 = lowest

and highest

rng() == 1, result = lowest + highest - lowest = highest

Including + 1 means that the upper bound on the generated number can be above highest

rng() == 1, result = lowest + highest - lowest + 1 = highest + 1.

The resulting distribution of values will be identical to the distribution of the random numbers, so uniformity depends on the quality of your random number generator.

Following on from your comment below you are right to point out that Int() will be the source of a lop-sided distribution at the tails. Better to use Round() to the nearest integer or whatever equivalent you have in your scripting language.

Simon
This isn't correct e.g lowest=0 and highest=9. You would only output 9 when the RNG=1 but would output 0 when the output of the RNG lies between 0 and 0.1
RobS
but it is not correct to have the +1 either, since when it is 1 (which should happend with uniform random probability, the same as all values between 0 and 0.1) then the answer is not 9, but 10. It seems to me that what you need is not integerising the value but rounding it to the nearest integer.
Simon
The standard solution is only correct if the RNG returns numbers from O (inclusive) to 1 (exclusive). I'll edit the question to make this clearer.
RobS
you are presuming the problem of the lop-sided tails needa a +1 and a frig when rng()==1. The problem comes not from that but from Int() which rounds down. If you round to the nearest number then the solution I posted above works precisely. The "standard" solution in the other post is wrong.
Simon
If you have a RNG that delivers numbers in the range [0,1) then the standard solution is correct. If you don't believe me then please try it out. The standard solution is, however, incorrect when the RNG delivers numbers in the range [0,1].
RobS
Sorry, didn't spot the ) bracket. However, despite that, I think my solution stands if you use Round() rather than Int(). Your rng() never produces precisely 1, but it will produce 0.95 and above which when rounded to the nearest integer will give you "highest" without the need for +1
Simon
I'm afraid my RNG does produce precisely 1, if it didn't then I wouldn't have a problem! The round solution doesn't work either I'm afraid. e.g. when lowest=0 and highest=9 then 0<=RNG<0.05 maps to 0 but 0.05<=RNG<.15 maps to 1.
RobS
+1  A: 

Presumably you are desperately interested in speed, or else you would just suck up the conditional test with every RNG call. Any other alternative is probably going to be slower than the branch anyway...

...unless you know exactly what the internal structure of the RNG is. Particularly, what are its return values? If they're not IEEE-754 floats or doubles, you have my sympathies. If they are, how many real bits of randomness are in them? You would expect 24 for floats and 53 for doubles (the number of mantissa bits). If those are naively generated, you may be able to use shifts and masks to hack together a plain old random integer generator out of them, and then use that in your function (depending on the size of your range, you may be able to use more shifts and masks to avoid any branching if you have such a generator). If you have a high-quality generator that produces full quality 24- or 53-bit random numbers, then with a single multiply you can convert them from [0,1] to [0,1): just multiply by the largest generatable floating-point number that is less than 1, and your range problem is gone. This trick will still work if the mantissas aren't fully populated with random bits, but you'll need to do a bit more work to find the right multiplier.

You may want to look at the C source to the Mersenne Twister to see their treatment of similar problems.

kquinn
For now I'm implementing the conditional on each call. I just wondered if there was a better way. I had thought about multiplying by the largest possible float that isn't 1 but I think this might shift the smallest possible float that isn't 0 to 0 and creating a non-uniform distribution.
RobS
And thanks for the other info, I'll look into it.
RobS
Yes, it's possible that such tricks will affect randomness at the 2^-53 level. But, you are converting the result to an integer anyway, so it's quite unlikely the final distribution will change. And if you truly need perfection at the 2^-53 level, you probably want a different language anyway.
kquinn