Would you say the modern version of the Fisher-Yates shuffle is the most unbiased shuffling algorithm? How would you explain that each element in the array has a probability of 1/n of being in its original spot?

+1  A: 

The modern (aka "Knuth") Fisher–Yates shuffle is

  • relatively simple to implement
  • fairly efficient: O(n) time and O(1) extra space, since it shuffles in place
  • unbiased (every permutation is equiprobable)
  • well known, well understood, proven and tested.

What else could we want out of an algorithm? (Well, yeah, when the number of permutations grows huge, one may try something else, but most cases do not involve such huge counts.)

Edit: I just noticed that this answer responds to the title of the question, not its content. (Which is why it is good for these two parts of a question to match better...)
In a nutshell, the shuffle will be as random as the particular RNG used to implement the algorithm.
An intuitive explanation: for an array with m elements, as n, the decreasing control variable of the loop, goes down towards 1, the number of cells that the cell at position n may be swapped with diminishes, but the probability that this very cell has already been moved increases in exactly the same proportion. In other words, the last element of the array could end up anywhere in the array, but it has only one chance to be moved (on the very first iteration). The second-to-last element has one fewer place to go, but there is a probability of 1/m that it has already been moved during the very first iteration, and so on.
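The intuition above can be checked exactly for small arrays. The following is only a sketch: it assumes the usual decreasing-index form of the loop, and the helper names shuffle_with_choices and position_probabilities are made up for this illustration. It enumerates every possible sequence of random draws (all equally likely) and tallies where each element ends up.

```python
from fractions import Fraction
from itertools import product

def shuffle_with_choices(n, choices):
    """Run the decreasing-index Fisher-Yates loop on [0..n-1] with the given draws."""
    arr = list(range(n))
    # i runs n-1, n-2, ..., 1; choices supplies the draw j in [0, i] for each step.
    for i, j in zip(range(n - 1, 0, -1), choices):
        arr[i], arr[j] = arr[j], arr[i]
    return tuple(arr)

def position_probabilities(n):
    """Return probs[e][p], the exact probability that element e ends at position p."""
    # Every combination of draws is equally likely, so enumerate them all.
    draw_ranges = [range(i + 1) for i in range(n - 1, 0, -1)]
    outcomes = [shuffle_with_choices(n, c) for c in product(*draw_ranges)]
    probs = [[Fraction(0)] * n for _ in range(n)]
    for perm in outcomes:
        for pos, elem in enumerate(perm):
            probs[elem][pos] += Fraction(1, len(outcomes))
    return probs

if __name__ == "__main__":
    for row in position_probabilities(4):
        print([str(p) for p in row])
```

For n = 4 every entry comes out as 1/4, including the diagonal, i.e. the chance that an element stays in its original spot. The enumeration grows as n!, so this is only practical for very small n.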

mjv
+3  A: 

Given a perfect pseudo-random number generator (the Mersenne Twister is very close), the Fisher-Yates algorithm is perfectly unbiased in that every permutation has an equal probability of occurring. This is easy to prove using induction. The Fisher-Yates algorithm can be written recursively as follows (in Python syntax pseudocode):

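import random

def fisherYatesShuffle(array, start=0):
    # Shuffles array[start:] in place; an index is passed because slicing would copy.
    if len(array) - start < 2:
        return

    # Uniform choice among the elements not yet placed (start itself included).
    firstElementIndex = random.randrange(start, len(array))
    array[start], array[firstElementIndex] = array[firstElementIndex], array[start]
    fisherYatesShuffle(array, start + 1)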

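Each index has an equal probability of being selected as firstElementIndex, so every element is equally likely to end up in the first position. When you recurse, you again have an equal probability of choosing any of the elements that are still left, and that holds whatever order the previous swap left them in.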
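The induction can be written out explicitly. This is a sketch that assumes the draws are independent and exactly uniform; the notation P_n(π), for the probability that an n-element array ends up in a given arrangement π, is introduced here and is not from the answer above. π' denotes the arrangement that π prescribes for the remaining n-1 positions.

```latex
\begin{align*}
P_1(\pi) &= 1, \\
P_n(\pi) &= \frac{1}{n} \cdot P_{n-1}(\pi')
          = \frac{1}{n} \cdot \frac{1}{(n-1)!}
          = \frac{1}{n!}, \\
\Pr[\text{a given element stays in its original spot}]
         &= \frac{(n-1)!}{n!} = \frac{1}{n}.
\end{align*}
```

The last line follows because exactly (n-1)! of the n! equally likely permutations leave any one chosen element where it started.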

Edit: The algorithm has been mathematically proven to be unbiased. Since the algorithm is non-deterministic, the best way to test whether an implementation works properly is statistically. I would take an array of some arbitrary but small size, shuffle it a bunch of times (starting with the same permutation as input each time) and count the number of times each output permutation occurs. Then, I'd use Pearson's Chi-square Test to test this distribution for uniformity.
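A minimal sketch of such a statistical test follows, using only the standard library. The function names, the seed and the trial count are arbitrary choices for illustration, and the shuffle under test is an iterative Fisher-Yates written here rather than any particular implementation from this thread. The statistic is computed by hand; the threshold 20.515 is the tabulated chi-square critical value for 5 degrees of freedom at significance level 0.001.

```python
import random
from collections import Counter
from itertools import permutations
from math import factorial

def fisher_yates_shuffle(items, rng):
    """In-place iterative Fisher-Yates shuffle using the supplied random.Random."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive at both ends
        items[i], items[j] = items[j], items[i]

def chi_square_statistic(shuffle, n, trials, rng):
    """Pearson's chi-square statistic of observed permutation counts vs. uniform."""
    counts = Counter()
    for _ in range(trials):
        items = list(range(n))  # same starting permutation every time
        shuffle(items, rng)
        counts[tuple(items)] += 1
    expected = trials / factorial(n)
    # Iterate over all n! permutations so that never-seen ones count too.
    return sum((counts[p] - expected) ** 2 / expected
               for p in permutations(range(n)))

if __name__ == "__main__":
    stat = chi_square_statistic(fisher_yates_shuffle, n=3, trials=60_000,
                                rng=random.Random(12345))
    # n = 3 gives 6 permutations, hence 5 degrees of freedom.
    print(f"chi-square = {stat:.2f}, looks uniform: {stat < 20.515}")
```

A statistic well above the threshold suggests bias. A correct shuffle will still exceed it about once in a thousand runs, so a single failure is a reason to rerun with a different seed rather than proof of a bug.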

dsimcha
Thanks dsimcha. How would you test a Fisher-Yates shuffle implementation and prove programmatically that it is unbiased? Or rather, my question to you is: what is the best way of testing it?
Phoenix