views:

68

answers:

3

I would like to confirm this, I was trying to sort a List of my class using Linq. But it seems the order of the data was not ordering the same way when i used a sort function.

Assume that the list contains 4 ComputeItem and all their A are set to 1, the B, C, D of all are set to zero.

CASE 1:

ItemList =
    ItemList
        .OrderByDescending(m => m.A)
        .ThenBy(m => m.B)
        .ThenBy(m => m.C)
        .ThenBy(m => m.D)
        .ToList<ComputeItem>();

versus

CASE 2:

ItemList.Sort(
    delegate(ComputeItem item1, ComputeItem item2)
    {
        if (item1.A == item2.A)
        {
            if (item1.B == item2.B)
            {
                if (item1.C == item2.C)
                {
                    return item1.D - item2.D;
                }
                else
                {
                    return item1.C - item2.C;
                }
            }
            else
            {
                return item1.B - item2.B;
            }
        }
        else
        {
            return item2.A - item1.A;
        }
    }
);

The result of the first sort is it did not move anything.
The result of the second sort is it sorted it to a different order.
Orignial Order [1, 2, 3, 4]
CASE 1 new Order [1, 2, 3, 4]
CASE 2 new Order [3, 4, 1, 2]

Now the problem is before I was using CASE2 and am trying to migrate it to CASE 1. But the behavior cannot change drastically than before. Any idea why the CASE 2 moved the ordering?

+2  A: 

is this right?

return item2.A - item1.A;

or should it be

return item1.A - item2.A;

Rahul
item2.A - item1.A was done so that it could be sorted in descending order...
Nassign
@Rahul: Your proposed change would cause the OP's two cases to no longer use identical comparisons (it would be the same as `OrderBy` on the first line instead of `OrderByDescending`).
Dan Tao
+3  A: 

I think Rahul's hit the nail on the head. Personally I'd find something like the following more readable for your case 2:

int result;

// Sort by A descending => invert the sign of the result
result = - item1.A.CompareTo(item2.A);
if (result != 0) return result;

// then by B ascending
result = item1.B.CompareTo(item2.B);
if (result != 0) return result;

// then by C ascending
result = item1.C.CompareTo(item2.C);
if (result != 0) return result;

// then by D ascending
result = item1.D.CompareTo(item2.D);
if (result != 0) return result;

// ... add other comparisons here if you ever need to in the future

return result;
Joe
yes. this is more readable. but i think Rahul's just pointed out the order or operator. I intensionally swapped then for the descending order. So in the case of your version it should be item2.A.CompareTo(item1.A)
Nassign
"So in the case of your version it should be item2.A.CompareTo(item1.A) " - yes you're right. I'll update it..
Joe
+5  A: 

The sorting algorithm used by OrderBy, OrderByDescending, ThenBy, and ThenByDescending is a stable QuickSort. From the MSDN documentation:

This method performs a stable sort; that is, if the keys of two elements are equal, the order of the elements is preserved.

List<T>.Sort uses an unstable version of QuickSort, which does not necessarily preserve the original ordering of equal elements. Again, from the MSDN documentation:

This method uses Array.Sort, which uses the QuickSort algorithm. This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved.

This explains the discrepancy you are seeing. Obviously, the end result of both will be that the items are ordered in the manner dictated by your comparison mechanism. It's only the order in which equal elements appear that is unspecified for List<T>.Sort.

The good news is that you're going from an unstable sort to a stable sort. I have trouble imagining that this could be a breaking change for you (what kind of software requires that a sort be unstable?). If anything it should make your program that much more predictable.

Dan Tao
This is my first time encountering stable and unstable Quicksort. But that makes sense.
Nassign
@Dan You are right. Using stable sort is more predictable.
Nassign
Backwards compatibility. There are plenty of scenarios where you'd want the user to get the exact same result from the same input, using a different implementation. Stable vs. unstable sort doesn't permit this. Even though the unstable sort doesn't guarantee behaviour, for a given set of inputs it's very likely to give a consistent result each time - which the user will have come to expect, and possibly require.
Tom W
@Tom W: Fair point. Obviously I can't know exactly what the OP's situation is; maybe my language at the end there is a bit too optimistic considering. But I would still be inclined to think that, in most cases, moving from a stable sort to an unstable sort would be *far* more likely to be disruptive to the user than moving in the opposite direction. But sure, reliance on a specific outcome from an unstable sort is possible. I think you may be overstating the likelihood of that, though. Then again, maybe I'm understating it.
Dan Tao