ansaurus

Question

In LINQ, does orderby() execute the comparing function only once or execute it whenever needed?

Answer 1

A:

Using a shufflebag will definitely work.

As for your OrderBy method, I don't think it is completely random, because the order of equal elements is preserved. So if you have a random array [5 6 7 2 6], the two sixes will always stay in the same relative order.

I'd have to run a frequency test to be sure.

Carra 2010-10-18 09:54:23

Thanks for this comment. If I need to do it more seriously, I will consider other methods. Well, at least it needs only a few lines.

LLS 2010-10-18 10:09:38

Although `OrderBy` performs a stable sort, the keys are generated per-index, not per-value. So identical values can potentially be shuffled when you generate a random key.

LukeH 2010-10-18 10:29:06
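
The point in the comment above can be checked with a small experiment. This is only a sketch: the `StableSortDemo` class, the `CountSwaps` method and the `Tag` property are made-up names for illustration. It sorts two equal values by a random key many times and counts how often the one that started second comes out first.

```csharp
using System;
using System.Linq;

public static class StableSortDemo
{
    // Shuffles two equal values, tagged so they can be told apart, and counts
    // how often the element that started second ends up first.
    public static int CountSwaps(int trials, Random rng)
    {
        var items = new[]
        {
            new { Value = 6, Tag = "first" },
            new { Value = 6, Tag = "second" },
        };

        int swaps = 0;
        for (int i = 0; i < trials; i++)
        {
            // The key selector ignores the element's value, so each element
            // gets its own independent random key.
            var shuffled = items.OrderBy(_ => rng.Next()).ToArray();
            if (shuffled[0].Tag == "second") swaps++;
        }
        return swaps;
    }
}
```

If equal values really were pinned in place, the count would always be zero. With independent random keys it should come out at roughly half the number of trials.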

Answer 2

+1 A:

Your approach should work but it is slow.

It works because OrderBy first calculates the key for every item using the key selector and then sorts the items by those keys. So the key selector is called only once per item.

In .NET Reflector see the method ComputeKeys in the class EnumerableSorter.

this.keys = new TKey[count];
for (int i = 0; i < count; i++)
{
    this.keys[i] = this.keySelector(elements[i]);
}
// etc...
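
If you would rather not rely on Reflector, you can observe the same behaviour by counting the calls yourself. A minimal sketch, assuming LINQ-to-Objects; `KeySelectorCallCounter` and `CountCalls` are names invented for this example:

```csharp
using System;
using System.Linq;

public static class KeySelectorCallCounter
{
    // Returns how many times the key selector ran while sorting `count` items.
    public static int CountCalls(int count)
    {
        var rng = new Random();
        int calls = 0;

        // ToArray() forces the deferred sort to actually execute.
        Enumerable.Range(0, count)
            .OrderBy(x => { calls++; return rng.Next(); })
            .ToArray();

        return calls;
    }
}
```

On the implementations I know of, `CountCalls(100)` returns 100, which matches the ComputeKeys loop above. It is still an observation about the implementation rather than a documented guarantee.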

You asked "whether this is absolutely safe and always works as expected": this behaviour is undocumented, so in theory it could change in the future.

For shuffling randomly you can use the Fisher-Yates shuffle. This is also more efficient - using only O(n) time and shuffling in-place instead of O(n log(n)) time and O(n) extra memory.

Related question

C#: Is using Random and OrderBy a good shuffle algorithm?

Mark Byers 2010-10-18 09:55:32

Thank you for your detailed answer and the advice. I didn't know there are similar questions.

LLS 2010-10-18 10:07:03

Answer 3

+1 A:

I assume that you're talking about LINQ-to-Objects, in which case the key used for comparison is only generated once per element. (Note that this is just a detail of the current implementation, and could change, although it's very unlikely to because such a change would introduce the bugs that you mention.)

To answer your more general question: your approach should work, but there are better ways to do it. OrderBy typically runs in O(n log n) time, whereas a Fisher-Yates-Durstenfeld shuffle runs in O(n):

var shuffledArray = myArray.Shuffle().ToArray();

// ...
// Requires: using System; using System.Collections.Generic; using System.Linq;

public static class EnumerableExtensions
{
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        return source.Shuffle(new Random());
    }

    public static IEnumerable<T> Shuffle<T>(
        this IEnumerable<T> source, Random rng)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (rng == null) throw new ArgumentNullException("rng");
        return source.ShuffleIterator(rng);
    }

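    private static IEnumerable<T> ShuffleIterator<T>(
        this IEnumerable<T> source, Random rng)
    {
        // Copy the source so the caller's sequence is never modified.
        T[] buffer = source.ToArray();
        for (int n = 0; n < buffer.Length; n++)
        {
            // Pick a random element from the not-yet-yielded range [n, Length).
            int k = rng.Next(n, buffer.Length);
            yield return buffer[k];
            // Move the element at n into the vacated slot so it can still be picked.
            buffer[k] = buffer[n];
        }
    }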
}

(And it's easy enough, and slightly more efficient, to create equivalent methods to perform an in-place shuffle on IList<T>, if you prefer.)
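
An in-place version might look something like the following. Treat it as a rough sketch rather than a tested library method: the `ListShuffleExtensions` class and the `ShuffleInPlace` name are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ListShuffleExtensions
{
    // Hypothetical helper: in-place Fisher-Yates-Durstenfeld shuffle.
    public static void ShuffleInPlace<T>(this IList<T> list, Random rng)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (rng == null) throw new ArgumentNullException("rng");

        // Walk from the end; swap each slot with a random slot at or before it.
        for (int n = list.Count - 1; n > 0; n--)
        {
            int k = rng.Next(n + 1); // 0 <= k <= n
            T tmp = list[k];
            list[k] = list[n];
            list[n] = tmp;
        }
    }
}
```

Usage would be something like `myList.ShuffleInPlace(new Random());`. Because arrays implement IList<T>, the same call works on `myArray` too. Unlike the iterator version, this modifies the caller's collection and needs no extra buffer.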

LukeH 2010-10-18 09:55:55

Thanks a lot. Actually, I have run into such bugs in C++.

LLS 2010-10-18 10:05:57
