If my collection is ordered by date, will Distinct() take the first object in a list of adjacent duplicates, or is that not guaranteed? I am using an IEqualityComparer that does not consider the date field, but I want to be sure the latest date is always taken.
It is not defined which one it takes.
In practice it will probably take the first if you are using LINQ to Objects, but you shouldn't rely on it.
If you access a database, it depends on the database version, query plan, etc., so there you really shouldn't rely on it always returning the first.
If you want this guarantee you could use DistinctBy from morelinq and ask Jon Skeet to guarantee the order for you.
You should use GroupBy:
from s in whatever
group s by new { s.Field1, s.Field2 } into g
select g.OrderByDescending(o => o.Date).First()
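For context, here is a minimal, self-contained sketch of that query. The `Item` record and the `LatestPerGroup` class are hypothetical names made up for illustration; `Field1`, `Field2` and `Date` simply mirror the names used in the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only the property names come from the query above.
public record Item(string Field1, string Field2, DateTime Date);

public static class LatestPerGroup
{
    // Groups by the fields that define a "duplicate" and keeps
    // the item with the latest Date from each group.
    public static List<Item> Pick(IEnumerable<Item> whatever) =>
        (from s in whatever
         group s by new { s.Field1, s.Field2 } into g
         select g.OrderByDescending(o => o.Date).First()).ToList();
}
```

Because each group is sorted explicitly, the result does not depend on the order of the input or on which duplicate Distinct() would have kept.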
EDIT: You can also use your IEqualityComparer with GroupBy:
whatever.GroupBy(
    s => s, // Key
    (key, g) => g.OrderByDescending(o => o.Date).First(), // Result
    new MyComparer()
);
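A fuller sketch of the comparer-based version, assuming a comparer that ignores the date. The `Item` record and the body of `MyComparer` are hypothetical stand-ins for the asker's own types. Note that the result selector takes two parameters, the key and the group; a one-parameter lambda in that position would bind to the element-selector overload of GroupBy instead.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type for illustration.
public record Item(string Field1, string Field2, DateTime Date);

// Hypothetical comparer: two items are "equal" when Field1 and Field2 match; Date is ignored.
public sealed class MyComparer : IEqualityComparer<Item>
{
    public bool Equals(Item? x, Item? y) =>
        ReferenceEquals(x, y) ||
        (x is not null && y is not null && x.Field1 == y.Field1 && x.Field2 == y.Field2);

    public int GetHashCode(Item obj) => HashCode.Combine(obj.Field1, obj.Field2);
}

public static class LatestPerComparerGroup
{
    public static List<Item> Pick(IEnumerable<Item> whatever) =>
        whatever.GroupBy(
                s => s,                                               // Key: the item itself
                (key, g) => g.OrderByDescending(o => o.Date).First(), // Result: latest item in the group
                new MyComparer())                                     // Key equality that ignores Date
            .ToList();
}
```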
In the comments of this answer, Marc Gravell and I discuss the Enumerable.Distinct method. The verdict is that order is preserved, but the documentation does not guarantee that this will always work.
Enumerable.Distinct doesn't define which value is returned, but I can't see how it would be sensible to return anything other than the first one. Likewise, although the order is undefined, it's sensible to return the items in the order in which they appear in the original sequence.
I don't normally like relying on unspecified behaviour, but I think it's extremely unlikely that this will change. It's the natural behaviour from keeping a set of what you've already returned, and yielding a result as soon as you see a new item.
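To illustrate why that behaviour is the natural one, here is a rough sketch of a set-based distinct operator. This is not the framework's source code; `DistinctFirstWins` is a hypothetical name, and the sketch only shows the "remember what you've seen, yield on first sight" idea described above.

```csharp
using System.Collections.Generic;

public static class DistinctSketch
{
    // Illustration only: yields each item the first time an equal one has not been seen yet.
    public static IEnumerable<T> DistinctFirstWins<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when an equal item is already present,
            // so later duplicates are skipped and source order is preserved.
            if (seen.Add(item))
                yield return item;
        }
    }
}
```

With an implementation of this shape, the first of each run of duplicates is the one returned, and items come out in their original order.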
If you want to rely on this unspecified behaviour, you should order the items by date (descending) before using Distinct. Alternatively, you could use a grouping and then order each group appropriately.
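A minimal sketch of the first option, with the caveat that it depends on Distinct keeping the first occurrence, which the documentation does not promise. The `Item` record and `DateBlindComparer` are hypothetical names used only for this example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type for illustration.
public record Item(string Field1, string Field2, DateTime Date);

// Hypothetical comparer that ignores Date, as in the question.
public sealed class DateBlindComparer : IEqualityComparer<Item>
{
    public bool Equals(Item? x, Item? y) =>
        ReferenceEquals(x, y) ||
        (x is not null && y is not null && x.Field1 == y.Field1 && x.Field2 == y.Field2);

    public int GetHashCode(Item obj) => HashCode.Combine(obj.Field1, obj.Field2);
}

public static class LatestViaDistinct
{
    // Relies on unspecified behaviour: assumes Distinct yields the first
    // occurrence of each set of equal items, so sorting newest-first
    // makes the latest item the survivor.
    public static List<Item> Pick(IEnumerable<Item> items) =>
        items.OrderByDescending(i => i.Date)
             .Distinct(new DateBlindComparer())
             .ToList();
}
```

If that assumption ever stopped holding, the grouping approach would still give the intended result, since it sorts within each group explicitly.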