tags:

views:

66

answers:

3

I found a method to shuffle an array on the internet.

Random rand = new Random();
shuffledArray = myArray.OrderBy(x => rand.Next()).ToArray();

However, I am a little concerned about the correctness of this method. If OrderBy executes x => rand.Next() many times for the same item, the results may conflict and result in weird things (possibly exceptions).

I tried it and everything is fine, but I still want to know whether this is absolutely safe and always works as expected, and I can't find the answer by Google.

Could anyone give me some explanations?

Thanks in advance.

A: 

Using a shufflebag will definitely work.

As for your orderby method, I think that it's not completely random as the order of equal elements is kept. So if you have a random array [5 6 7 2 6] then the elements at the two sixes will always be in the same order.

I'd have to run a frequency test to be sure.

Carra
Thanks for this comment. If I need to do it more seriously, I will consider other methods. Well, at least it needs only a few lines.
LLS
Although `OrderBy` performs a stable sort, the keys are generated per-index, not per-value. So identical values can potentially be shuffled when you generate a random key.
LukeH
+1  A: 

Your approach should work but it is slow.

It works because OrderBy first calculates the keys for every item using the key selector, then it sorts the keys. So the key selector is only called once per item.

In .NET Reflector see the method ComputeKeys in the class EnumerableSorter.

this.keys = new TKey[count];
for (int i = 0; i < count; i++)
{
    this.keys[i] = this.keySelector(elements[i]);
}
// etc...

whether this is absolutely safe and always works as expected

It is undocumented so in theory it could change in future.

For shuffling randomly you can use the Fisher-Yates shuffle. This is also more efficient - using only O(n) time and shuffling in-place instead of O(n log(n)) time and O(n) extra memory.

Related question

Mark Byers
Thank you for your detailed answer and the advice. I didn't know there are similar questions.
LLS
+1  A: 

I assume that you're talking about LINQ-to-Objects, in which case the key used for comparison is only generated once per element. (Note that this is just a detail of the current implementation, and could change, although it's very unlikely to because such a change would introduce the bugs that you mention.)

To answer your more general question: your approach should work, but there are better ways to do it. Using OrderBy will typically be O(n log n) performance, whereas a Fisher-Yates-Durstenfeld shuffle will be O(n):

var shuffledArray = myArray.Shuffle().ToArray();

// ...

public static class EnumerableExtensions
{
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        return source.Shuffle(new Random());
    }

    public static IEnumerable<T> Shuffle<T>(
        this IEnumerable<T> source, Random rng)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (rng == null) throw new ArgumentNullException("rng");
        return source.ShuffleIterator(rng);
    }

    private static IEnumerable<T> ShuffleIterator<T>(
        this IEnumerable<T> source, Random rng)
    {
        T[] buffer = source.ToArray();
        for (int n = 0; n < buffer.Length; n++)
        {
            int k = rng.Next(n, buffer.Length);
            yield return buffer[k];
            buffer[k] = buffer[n];
        }
    }
}

(And it's easy enough, and slightly more efficient, to create equivalent methods to perform an in-place shuffle on IList<T>, if you prefer.)

LukeH
Thanks a lot. Actually I met such bugs in C++.
LLS