views:

116

answers:

2

I've found myself running back through some old 3.5 framework legacy code, and found some points where there are a whole bunch of lists and dictionaries that must be updated in a synchronized fashion. I've determined that I can make this process infinitely easier to both utilize and understand by converging these into custom container classes of new custom classes. There are some points, however, where I came to concerns with organizing the contents of these new container classes by a specific inner property. For example, sorting by the ID number property of one class.

As the container classes are primarily based around a generic List object, my first instinct was to write the inner classes with IComparable, and write the CompareTo method that compares the properties. This way, I can just call items.Sort() when I want to invoke the sorting.

However, I've been thinking instead about using items = items.OrderBy(Func) instead. This way it is more flexible if I need to sort by any other property. Readability is better as well, since the property used for sorting will be listed in-line with the sort call rather than having to look up the IComparable code. The overall implementation feels cleaner as a result.

I don't care for premature or micro optimization, but I like consistency. I find it best to stick with one kind of implementation for as many cases as it is appropriate, and use different implementations where it is necessary. Is it worth it to convert my code to use the LINQ OrderBy instead of using List.Sort? Is it a better practice to stick with the IComparable implementation for these custom containers? Are there any significant mechanical advantages offered by either path that I should be weighing the decision on? Or is their end-functionality equivalent to the point that it just becomes coder's preference?

+3  A: 

The major point here is that List<T>.Sort() does the sorting in place. If your list is exposed to external code, it will always represent the same object to this code. This is important if the list is kept in a field by code outside of the container class. If you're sorting with OrderBy(), you'll get a new enumeration each time, replacing the previous items. Any previously stored list will not represent the current state of your class.

Considering performance, OrderBy will have to iterate through the whole list to sort items. Then you will call ToList() to create the new list from this enumeration, iterating through the list a second time. Plus, since it's an enumeration, List will use the doubling algorithm, increasing its size until every element can fit into it. In case of a large list, that could be quite a few allocations and memory copying. I would expect performance to be much worse than List<T>.Sort().

Edit: Small benchmark:

internal class Program {

    private static List<int> CreateList(int size) {

        // use the same seed so that every list has the same elements
        Random random = new Random(589134554);

        List<int> list = new List<int>(size);
        for (int i = 0; i < size; ++i)
            list.Add(random.Next());
        return list;
    }

    private static void Benchmark(int size, bool output = true) {
        List<int> list1 = CreateList(size);
        List<int> list2 = CreateList(size);

        Stopwatch stopwatch = Stopwatch.StartNew();
        list1.Sort();
        stopwatch.Stop();
        double elapsedSort = stopwatch.Elapsed.TotalMilliseconds;
        if (output)
            Console.WriteLine("List({0}).Sort(): {1}ms (100%)", size, elapsedSort);

        stopwatch.Restart();
        list2.OrderBy(i => i).ToList();
        stopwatch.Stop();
        double elapsedOrderBy = stopwatch.Elapsed.TotalMilliseconds;
        if (output)
            Console.WriteLine("List({0}).OrderBy(): {1}ms ({2:.00%})", size, elapsedOrderBy, elapsedOrderBy / elapsedSort);

    }

    internal static void Main() {

        // ensure linq library is loaded and initialized
        Benchmark(1000, false);

        Benchmark(10);
        Benchmark(100);
        Benchmark(1000);
        Benchmark(10000);
        Benchmark(100000);
        Benchmark(1000000);

        Console.ReadKey();
    }
}

Output (normalized to List.Sort):

List(10).Sort(): 0,0025ms (100%)
List(10).OrderBy(): 0,0157ms (628,00%)
List(100).Sort(): 0,0068ms (100%)
List(100).OrderBy(): 0,0294ms (432,35%)
List(1000).Sort(): 0,0758ms (100%)
List(1000).OrderBy(): 0,3107ms (409,89%)
List(10000).Sort(): 0,8969ms (100%)
List(10000).OrderBy(): 4,0751ms (454,35%)
List(100000).Sort(): 10,8541ms (100%)
List(100000).OrderBy(): 50,3497ms (463,88%)
List(1000000).Sort(): 124,1001ms (100%)
List(1000000).OrderBy(): 705,0707ms (568,15%)
Julien Lebosquain
Well, the volume indicated by your benchmarks speaks... volumes. At rates like that, I can even squeeze in 3 Sorts to replace that one point I had used 3 OrderBys, and get better performance than if I had used just one OrderBy! Sort() it is!
ccomet
One quick follow-up, judging by your calculations... would this also mean that `items.First()` is significantly slower than `items[0]`?
ccomet
No, `First()` is optimized for `IList<T>` implementations and will return `list[0]`. I expect performance to be the same. Even if it was directly using the enumeration, it's still be only one item to iterate so it will probably be a micro optimization.
Julien Lebosquain
Alright, that makes sense. Also, some further testing... calling .Sort() three times, even specifying separate IComparisons as SchlaWiener demonstrates, is faster than calling OrderBy().ThenBy().ThenBy().
ccomet
+1  A: 

As Julien said, Sort() will probably be faster, however since you mentioned the better readability and flexibility of OrderBy(), you can achieve that with your Sort() method, too (at least if the Property you want to base your sort on is comparable)

items = items.OrderBy(x => x.Name).ToList();
items.Sort((x,y) => x.Name.CompareTo(y.Name)); // If x.Name is never null
items.Sort((x,y) => String.Compare(x.Name, y.Name)); // null safe
SchlaWiener
One of these days, I'll stop forgetting the extra little things like "I can specify the Sort function instead of relying on IComparable". Which is great, because this allows me to avoid specifying certain classes as being "comparable" when they don't really have that feel, they just need to be sorted in certain situations. I've combined your answer with the data of Julien to make my success. Many thanks!
ccomet