tags:

views:

103

answers:

5

In this question one of the suggestions is to sort a list by Random.Next().

I assume (maybe incorrectly) he's suggesting this

    public static IEnumerable<T> RandomSort<T>(this IEnumerable<T> items)
    {
        Random r = new Random();
        var a = items.ToArray();
        Array.Sort(a, (t1, t2) => (r.Next()%2==0)?-1 : 1);
        return a;
    }

(Yes, there already is an Array.RandomShuffle function which you would obviously use instead. That's not the question)

EDIT: The poster has clarified the answer. He was suggesting the use of the OrderBy clause, which would obviously be safe

The question is, is the above code (Using Array.Sort()) safe to run?

My issue is that it will be breaking a fundamental law of sorting predicates:

if (a < b) and (b < c) then (a < c)

It's not even guaranteeing that if (a < b) then ( a < b) the next time you ask.

Would this would take you into "undefined behaviour" territory?

For example, it could crash or fall into an infinite loop depending upon the sequence of numbers that Random() returns?

A: 

this is basically a good way to randomize your list. it works, but is kinda haxor.

Muad'Dib
A: 

Just use a random shuffle.

http://www.techtalkz.com/c-c-sharp/66613-shuffle-collection.html

Hamish Grubijan
+2  A: 

This is a useful device for creating a random permutation of the list. For a given permutation it is absolutely true that if a is before b and b is before c then a is before c.

Another way to think of it is like this: if you seed a random number generator with the same seed each time then it will always produce the same ordering. So you can think of every seed of the random number generator as producing a (possibly) different ordering of the list.

It's not even guaranteeing that if (a < b) then ( a < b) the next time you ask.

That's fine. But as explained above if we seed the random number generator with the same seed and present it to Array.Sort as in your sample code in the same state, it will produce the same ordering.

Jason
A: 

the way it looks like it works is it's only asked once, then every subsequent query on that value it's the same.

it would be like adding an extra column to a table, filling each field with a random number, then ordering by that column.

John Boker
A: 

You are correct that using a random number function for sorting is an easy way to break the formal requirements typically imposed on sorting functions, including the ones you mentioned.

The current implementation of LINQ seems to compute all of the sort keys up front. The following code demonstrates that the sort keys are obtained sequentially and exactly once. This is likely a wise implementation since it avoids the problematic scenarios you are worried about.

Nonetheless, I would not rely on this behavior (it is not documented in MSDN for Enumerable.OrderBy.

    public void Test()
    {
        int[] nums = Enumerable.Range(1, 100).ToArray();
        var sorted = from i in nums orderby Display(i) select i;
        sorted.ToArray();
    }

    private static int Display(int i)
    {
        Console.WriteLine(i);
        return i;
    }

(The above code prints 1 to 100 sequentially, showing how orderby evaluates the sort keys.)

binarycoder
Please state the formal requirements of a sorting function that you believe are broken by the above methodology of producing an ordering of a list.
Jason
See the documentation for IComparable<T>.CompareTo (http://msdn.microsoft.com/en-us/library/43hc6wht.aspx). It is also usually expected that the function is deterministic (same input implies same output), although that is not stated on that page. In the context of LINQ, it is not clear what is expected because it is not documented. LINQ seems to avoid the problems by computing all the sort keys up front, but I could not find this behavior documented in MSDN.
binarycoder
Sorting (and even `Array.Sort`) existed long before `IComparable<T>.CompareTo` and LINQ so they are not germane to this discussion. And the above does produce the same output given identical input; give `Array.Sort` the same list and random number generator in the same state twice (or thrice or more) and will produce the same output every time.
Jason
<same list and random number generator> No argument is being passed to r.Next(). Since it does not return a constant, it is not deterministic with respect to its arguments. It is sensitive to the order and number of times LINQ evaluates the sort keys. I have demonstrated that LINQ evaluates the sort keys in a way that avoids problems with this (that is, all are evaluated up front and only once), but AFAIK this is an undocumented part of the LINQ implementation. For formal properties, research partial order and total order (http://en.wikipedia.org/wiki/Total_order).
binarycoder
Again, if you provide `Array.Sort` with the same list and a random number generator with the same state, `Array.Sort` will produce the same output.
Jason
<Again, if you provide>How are you sure that LINQ will not call it more than once for the same item? Lots of sort algorithms are commonly implemented this way. If it is called more than once, it will return a different value. This could cause undefined behavior, including an exception or hang depending on the implementation. The sort algorithm rightfully expects it to return the same answer for the same item every time. LINQ appears to do extra work up front to avoid this issue, but it is undocumented.
binarycoder