views:

642

answers:

8

This recent question about sorting randomly using C# got me thinking about the way I've sometimes shuffled my arrays in Perl.

@shuffled = sort { rand() <=> rand() } @array;

The proposed solution in the mentioned question is Fisher-Yates shuffle, which works in a linear time.

The question is: how efficient is my snippet and is such shuffle "really" random?

+3  A: 

This is just intuition, but I think that using a sort like this will produce a set whose order is to some degree dependent on the order of the original set. The results of a truly random sort should not depend at all on the order of the original set. I can't explain why/how, maybe somebody else can (or show that it is in fact random)?

As to how efficient it is, I'm not sure, but it's probably not a whole lot less efficient than any other sort using sort, since AFAIK rand() is relatively cheap. I could be wrong there, though.

Adam Bellaire
I've gathered just as much as you, but they're just hunches as well. I was hoping someone would, at least pretend to, know :)
Tuminoid
@Tim: I don't think the rand() <=> rand() is a shuffle, it is not a correct algorithm. Its results are unpredictable, but not random. That aside, you are re-iterating what I stated. If you are using a truly random sort (a shuffle), then initial ordering is not relevant.
Adam Bellaire
+9  A: 

I'm not a Perl internals expert, so I don't know how "sort" will work here. However, most sort functions expect consistency in their comparisons, and I would expect them to work unpredictably if the function is itself random. Unfortunately, unpredictability is not the same as randomness, and so I'd have no confidence in your shuffled array. It may tend to put elements in some sort of order, just as a hastily created and complicated recurrence relation may not be random.

Rather than analyze the sort function, I'd suggest going with Fisher-Yates.

As Knuth said, randomness is too important to be left to chance.

David Thornley
+8  A: 

I'm actually a bit surprised that your proposed shuffle works. In the implementation of the Perl sort function, it tries to get the elements of the array into ascending order depending on the value of your comparison function. The problem is, your comparison function doesn't return consistent answers! Sometimes it might say "foo" lt "bar", while other times it might say "bar" lt "foo". This has the possibility of confusing the sorting algorithm to the point where it never terminates, or terminates with a fatal error, or some other catastrophic failure.

Greg Hewgill
+4  A: 

The perl documentation on sort says this

The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x[1] is less than $x[2] and sometimes saying the opposite, for example) the results are not well-defined.

So it's a bad idea to do that.

ETA: I just did a benchmark. On a 100000 element array, using a FY-shuffle is more than 10 times faster too.

Leon Timmermans
+3  A: 

There's a better Fisher-Yates shuffle function that doesn't use the sort builtin in the perlfaq4: How do I shuffle an array randomly?.

brian d foy
Kudos for pointing to Perl's awesome documentation. I might add that perldoc -q shuffle from the command line is another way to find the same information.
converter42
+4  A: 

For one thing, you know that no matter the comparator you use sort() can't possibly be faster than O(n log n). So even if the shuffle it performs is fair, its performance will be worse.

So is the shuffle fair? It's obviously not fair for some (easy to analyze) sorting algorithms. Consider a simple bubble sort - in order for an element to move from one end to the other, the comparison function has to evaluate positive for n consecutive calls - a 1 in 2^n probability for what should be a 1 in n event. For quick sort, it's hard to analyze and it's possible that ends up being fair. But if it's important that it be right, do it the right way.

Brian
+8  A: 
$ perldoc List::Util
⋮
  shuffle LIST
       Returns the elements of LIST in a random order

           @cards = shuffle 0..51      # 0..51 in a random order
⋮

That would be my suggestion.

derobert
A: 

@shuffled = map {
  $_->[1]
} sort {
  $a->[0] <=> $b->[0]
} map {
  [ rand(), $_ ]
} @array;
PP