views:

162

answers:

2

I would like to compare lists of elements of a given type, to see which list is "bigger".

new BuiltInComparer<IEnumerable<int>>().Compare(
    new[] {3,2,3}, 
    new[] {1,2,3})

...would return 1

new BuiltInComparer<IEnumerable<int>>().Compare(
    new[] {1,2,3}, 
    new[] {1,2,4})

...would return -1 etc

Is there any such built in comparer?

+5  A: 

I don't think there's anything built into the framework - and as Eric says, you haven't provided the comparison criteria. If you mean "compare element-wise in the natural way, and assume a 'missing' element is smaller than any present element" (i.e. a longer sequence beats a shorter subsequence if they're equal where possible) then something like this would do it:

public int SequenceCompare<T>(IEnumerable<T> source1, IEnumerable<T> source2)
{
    // TODO: Parameter validation :)
    // You could add an overload with this as a parameter
    IComparer<T> elementComparer = Comparer<T>.Default;       
    using (IEnumerator<T> iterator1 = source1.GetEnumerator())
    using (IEnumerator<T> iterator2 = source2.GetEnumerator())
    {
        while (true)
        {
            bool next1 = iterator1.MoveNext();
            bool next2 = iterator2.MoveNext();
            if (!next1 && !next2) // Both sequences finished
            {
                return 0;
            }
            if (!next1) // Only the first sequence has finished
            {
                return -1;
            }
            if (!next2) // Only the second sequence has finished
            {
                return 1;
            }
            // Both are still going, compare current elements
            int comparison = element.Compare(iterator1.Current,
                                             iterator2.Current);
            // If elements are non-equal, we're done
            if (comparison != 0)
            {
                return comparison;
            }
        }
    }
}
Jon Skeet
@JonSkeet Nice. Offtopic: why use the using's ? What's there to dispose? Is it for the case where, let's say, we're enumerating over something pulled from a database, or does this have value even for plain old lists?
Cristi Diaconescu
@Cristi: IEnumerator<T> implements `IDisposable`. This fact alone is enough to wrap it in a `using` statement. Whether there is something to dispose or not, but in Jon's example, the function accepts `IEnumerable<T>` arguments, which are interfaces. There is therefore no way to know whether there is anything to dispose or not. But it is very well possible that the used enumerator calls to the database. Not disposing it could leave the connection open.
Steven
@Cristi: It's very important to dispose of `IEnumerator<T>` values. That's the way that finally blocks get to be executed in iterator blocks - for example, if you have an iterator which reads lines from a file, disposing of the iterator should close the file.
Jon Skeet
It's to support the poorly contrived notion of try { yield return x; } finally { } -- Well once bitten, twice shy, I don't like yield for this very reason.
csharptest.net
csharptest.net: I like `using` for this very reason.
Ken
A: 

If you're on .NET 4 (and it doesn't sound like you are), I think you might be able to do something clever with Enumerable.Zip. Something like:

var r = x.Zip(y, comparer.Compare).FirstOrDefault(c => c != 0);

though I can't see right now how to efficiently deal with the case where the shorter one is the same as the longer one, as far as it goes.

Edit: If you're only comparing arrays (or otherwise don't care about measuring your collections twice), then I think you can simply add:

if (r == 0) {
    r = int.Compare(x.Count(), y.Count());
}

You could even combine these as:

var r = x.Zip(y, comparer.Compare)
         .Concat(new [] { int.Compare(x.Count(), y.Count()) })
         .FirstOrDefault(c => c != 0)

(And if you're on .NET 3.5, then add a Zip extension method, because it's easy to write and seriously useful all over the place! I don't know why it wasn't included in the initial Linq release.)

Ken
Re: why no zip in the initial release: (1) even small methods have testing, documentation, etc, burden and we were very short on time for that release, (2) we were considering adding a special syntax for zip joins to the query comprehension syntax. Ultimately we decided not to, but it would have looked pretty foolish to write a method and then write a zip join query syntax that was *incompatible* with that method for some reason. It's a good idea to design these things all at once if you can.
Eric Lippert
Of course any feature needs work to release. My comment was more to wonder why other methods made it in, and not Zip. There's methods from Linq's initial release I've still never even seen used once, anywhere. :-)
Ken