tags:

views:

54

answers:

1

I just discovered that Except() will remove all elements in the second list from the first, but it also has the effect that it makes all elements in the returned result distinct.

Simple way around I am using is Where(v => !secondList.Contains(v))

Can anyone explain to me why this is the behavior, and if possible point me to the documentation that explains this?

+2  A: 

The documentation for the Except function states:

Produces the set difference of two sequences by using the default equality comparer to compare values.

The set difference of two sets is defined as the members of the first set that do not appear in the second set.

The important word here is set, which is defined as:

...an abstract data structure that can store certain values, without any particular order, and no repeated values...

Because Except is documented as a set-based operation, it also has the effect of making the resulting values distinct.

Greg Beech
Well that makes a little more sense. Been way to long since I took set math. Thanks.
Stephan
@Stephan - Though, I have to say, the documented behaviour does feel a little weird to me given that `IEnumerable<T>` is a sequence and not a set, so to have a general-purpose extension method for a sequence have set semantics is somewhat strange. But then again, I guess the alternative of having an `Except` method from `IEnumerable<T>` have sequence semantics and one from `ISet<T>` have set semantics is even worse, as `ISet<T>` inherits `IEnumerable<T>` and thus the semantics would differ depending on how the extension method was bound by the compiler.
Greg Beech
@Greg, Mono seems to get this [wrong](http://anonsvn.mono-project.com/viewvc/trunk/mcs/class/System.Core/System.Linq/Enumerable.cs#l794) (this is actually the second Mono Enumerable bug I've found through StackOverflow). It returns elements from first as many times as they appear. My question is, if it's a set difference, why does the documentation show Enumerable.Except preserving the order of first's elements (it also says it returns "A sequence")? Sets don't preserve order. Is Enumerable.Except guaranteed to?
Matthew Flaschen
@Matthew - My guess is that for small sequences as in the docs that order might be preserved as sets are often implemented with a hybrid approach, using a list for small capacity and converted to a bucket implementation as the capacity grows, to give good performance in both cases. But I'd consider that to be an implementation detail rather than a contract, and I wouldn't rely on the output of `Except` being in any particular order as the documentation doesn't state that it will be, and implies that it isn't.
Greg Beech
@Greg I have to agree that it feels strange. That's what threw me off. Calling `Except` on an IEnumerable I incorrectly assumed it would just enumerate and remove values that matched in the second list. I didn't really expect for it to alter the purpose of the enumerable into a set.
Stephan
I've filed [a bug](https://bugzilla.novell.com/show_bug.cgi?id=611821) for Mono.
Matthew Flaschen