172

7
+7  Q:

## random permutation

I would like to genrate a random permutation as fast as possible. The problem: The knuth shuffle which is O(n) involves generating n random numbers. Since generating random numbers is quite expensive. I would like to find an O(n) function involving a fixed O(1) amount of random numbers.

I realize that this question has been asked before, but I did not see any relevant answers.

Just to stress a point: I am not looking for anything less than O(n), just an algorithm involving less generation of random numbers.

Thanks

+5  A:

Create a 1-1 mapping of each permutation to a number from 1 to n! (n factorial). Generate a random number in 1 to n!, use the mapping, get the permutation.

For the mapping, perhaps this will be useful: http://en.wikipedia.org/wiki/Permutation#Numbering_permutations

Of course, this would get out of hand quickly, as n! can become really large soon.

Yes, this is the optimal algorithm in terms of randomness needed. To be specific about how large: log(n!) is in fact O(n log n) by Stirling's approximation, so we need to generate about n log n bits for this. This of course is optimal; we cannot generate a perfectly uniformly distributed permutation with fewer than log(n!) bits.
Nice solution! I did a quick calculation, however: if we want to store all permutation of an array of 10 integers and assume the size of an integer is 4 bytes. The needed memory would be `10! * 4B * 10 = 138MB`! I don't think this method is usable...
+1 (I had the same idea) but if generating `N` random number is too expensive (somehow), then perhaps `N` is quite huge that generating 1 random number in range of `[1..N!]` may not be `O(1)`. The mapping will likely involve divisions too, an operation that may not be `O(1)` at this scale.
Agree with ShreevatsaR and polygeneL. @Job, you don't have to store all permutations, the mapping could be a procedure, instead of a lookup table.
@ShreevatsaR One random number of magnitude n! is the same 'amount of randomness' as n random numbers of magnitudes n, n-1, ... 1.
@Nick Johnson: Oh you're right! I didn't say anything incorrect, but what you said shows that the Knuth shuffle is *also* optimal. Two things to note: (i) to generate a random number in [1,n], most implementations need to go up to a power of 2 larger than n, so there can be more "waste" with multiple calls to RNG (ii) if you do it stupidly and always generate from a large range and then do modulo, multiple calls can be a waste too. And a *SERIOUS ISSUE*: many RNGs do *not* have lg(N!)≈NlgN bits of state in their seed! If you don't take this into account, all results will be seriously biased!
If your algorithm's behaviour is entirely determined by 32 bits of state, then only 2^32 permutations can ever be produced — the huge number of remaining permutations will *never* be produced. 2^32 is less than even 13!, and 2^64 is less than 21!… you need *at least* 525 bits to generate a random permutation of 100 elements.
Thanks for the great answers. I accept Nick and Shreeva's claim as to the ammount of randomness. I will use the Knuth shuffle then...
+2  A:

Generating a random number takes long time you say? The implementation of Javas Random.nextInt is roughly

``````oldseed = seed;
return (int)(nextseed >>> (48 - bits));
``````

Is that too much work to do for each element?

Perhaps `java.util.Random` is not good enough, and `SecureRandom` is used? But the general idea of your answer is correct: it shouldn't be THAT expensive to generate a random number, even if doesn't use this exact algorithm you quoted.
fine random generators may take longer than simpler one... see e.g. http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generator-Performance.html
A:

Generate a `32` bit integer. For each index `i` (maybe only up to half the number of elements in the array), if bit `i % 32` is `1`, swap `i` with `n - i - 1`.

Of course, this might not be random enough for your purposes. You could probably improve this by not swapping with `n - i - 1`, but rather by another function applied to `n` and `i` that gives better distribution. You could even use two functions: one for when the bit is `0` and another for when it's `1`.

+1  A:

Not what you asked exactly, but if provided random number generator doesn't satisfy you, may be you should try something different. Generally, pseudorandom number generation can be very simple.

Probably, best-known algorithm
http://en.wikipedia.org/wiki/Linear_congruential_generator

LCG is one of the worst PRNGs. Depending on application (i.e. what this permutation is used for), it may not be appropriate to use LCG. The better ones are slower. They're not THAT slow, however.
Maybe. Or maybe not. But you can't deny that for 99% of tasks LCG and similar algorithms are just fine :)
or xor-shift, this PRNG passes the diehard tests
+1  A:

As other answers suggest, you can make a random integer in the range 0 to N! and use it to produce a shuffle. Although theoretically correct, this won't be faster in general since N! grows fast and you'll spend all your time doing bigint arithmetic.

If you want speed and you don't mind trading off some randomness, you will be much better off using a less good random number generator. A linear congruential generator (see http://en.wikipedia.org/wiki/Linear_congruential_generator) will give you a random number in a few cycles.

+1  A:

Usually there is no need in full-range of next random value, so to use exactly the same amount of randomness you can use next approach (which is almost like random(0,N!), I guess):

``````// ...
m = 1; // range of random buffer (single variant)
r = 0; // random buffer (number zero)
// ...
for(/* ... */) {
while (m < n) { // range of our buffer is too narrow for "n"
r = r*RAND_MAX + random(); // add another random to our random-buffer
m *= RAND_MAX; // update range of random-buffer
}
x = r % n; // pull-out  next random with range "n"
r /= n; // remove it from random-buffer
m /= n; // fix range of random-buffer
// ...
}
``````

P.S. of course there will be some errors related with division by value different from 2^n, but they will be distributed among resulted samples.

+1  A:

Generate N numbers (N < of the number of random number you need) before to do the computation, or store them in an array as data, with your slow but good random generator; then pick up a number simply incrementing an index into the array inside your computing loop; if you need different seeds, create multiple tables.