Hello,

I am playing with LINQ to learn about it, but I can't figure out how to use Distinct when I don't have a simple list (a simple list of integers is pretty easy to do; that is not the question). What if I want to get distinct items from a list of objects based on ONE or MORE properties of the object?

Example: If an object is a "Person" with a property "Id", how can I get all Person objects and make them distinct by the Id property?

Person1: Id=1, Name="Test1"
Person2: Id=1, Name="Test1"
Person3: Id=2, Name="Test2"

How can I get just Person1 and Person3? Is that possible?

If it's not possible with LINQ, what would be the best way to get a list of Person objects that is distinct on some of its properties in .NET 3.5?

A: 

You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example that would select only Person3, because Person1 and Person2 share an Id and exclude each other. I'm not sure how to tell which of those two you want.
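
To make that behaviour concrete, here is a small sketch of the same query wrapped in a method. The Person class and the names UniqueIdDemo and WithUniqueId are assumptions for illustration; Person is assumed to be a plain class, so != compares references.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person class; != on it is a reference comparison.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class UniqueIdDemo
{
    public static List<Person> WithUniqueId(List<Person> people)
    {
        // Keeps p only if no other object in the list shares its Id.
        // Every element is compared against every other one, so this is O(n^2).
        return people
            .Where(p => !people.Any(q => p != q && p.Id == q.Id))
            .ToList();
    }
}
```

With the three people from the question, only the one with Id 2 comes back, since both people with Id 1 rule each other out.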

mquander
A: 

You should be able to override Equals (and GetHashCode along with it) on Person so that equality is based on Person.Id. A plain Distinct() call ought to then give you the behavior you're after.
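
A minimal sketch of that idea, assuming a hypothetical Person class with Id and Name properties (EqualsDemo and DistinctPeople are illustrative names too). Distinct() relies on hashing, so GetHashCode has to be overridden to agree with Equals; otherwise two "equal" people can hash differently and both survive.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type; equality is defined by Id alone.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        Person other = obj as Person;
        return other != null && other.Id == Id;
    }

    // Must agree with Equals: equal objects need equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class EqualsDemo
{
    public static List<Person> DistinctPeople(IEnumerable<Person> people)
    {
        // Distinct() uses the default equality comparer, which calls the overrides above.
        return people.Distinct().ToList();
    }
}
```

The trade-off is that this changes what equality means for Person everywhere, not just in this one query.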

GWLlosa
+11  A: 

What you need is a "distinct-by" effectively. I don't believe it's part of LINQ as it stands, although it's fairly easy to write:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

Untested, but it should work (and it now at least compiles).

It assumes the default comparer for the keys though - if you want to pass in an equality comparer, just pass it on to the HashSet constructor.
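
As a sketch of that comparer variant: the overload below takes an IEqualityComparer<TKey> and hands it to the HashSet. The method name DistinctByKey is an assumption, chosen so the example cannot clash with the Enumerable.DistinctBy method that ships with .NET 6 and later.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctByKeyExtensions
{
    // Hypothetical name; behaves like the method above but with a caller-supplied comparer.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        // Passing null makes HashSet fall back to the default comparer for TKey.
        HashSet<TKey> seenKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            // Add returns false when the key has been seen before.
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
```

For example, names.DistinctByKey(n => n, StringComparer.OrdinalIgnoreCase) would treat "a" and "A" as the same key and keep whichever comes first.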

Jon Skeet
This is a good solution, if you assume that when there are multiple non-distinct values (like in his example) you're aiming to return the first one that you see in the enumeration.
mquander
Yes, that's what I was assuming based on the question. If he'd requested Person2 and Person3 it would be harder :)
Jon Skeet
I think you made an error... the if statement should not have the !. You want to yield when the element is added, right?
Daok
Yup, fixing, thanks.
Jon Skeet
Thanks for all your answers today, Jon. I am searching the web with your solution and starting to understand. From this thread I have learned about Func<TSource, TKey>. Thanks.
Daok
@Jon: +1 Thanks! I'm always surprised by what you can do with Extension methods, they are like that Static Utility class we all carry around
Chris
+14  A: 

What if I want to get distinct items from a list of objects based on ONE or MORE properties of the object?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();
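
The multi-property version works because anonymous types compare by value: two keys with the same PersonId and FavoriteColor are equal and produce the same hash code. Here is a self-contained sketch, assuming a hypothetical Person class with those properties (GroupByDemo and DistinctByIdAndColor are illustrative names):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type using the property names from the snippets above.
public class Person
{
    public int PersonId { get; set; }
    public string FavoriteColor { get; set; }
    public string Name { get; set; }
}

public static class GroupByDemo
{
    public static List<Person> DistinctByIdAndColor(IEnumerable<Person> allPeople)
    {
        return allPeople
            // Anonymous types implement Equals and GetHashCode over all their properties.
            .GroupBy(p => new { p.PersonId, p.FavoriteColor })
            // For in-memory sequences each group keeps source order, so First() is the earliest match.
            .Select(g => g.First())
            .ToList();
    }
}
```

Swapping First() for something like OrderBy(...).First() is one way to pick a different "winner" from each group.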
David B
Thx for the example! +1
Daok
A: 

I realize this is a bit late... but I've written an article that explains how to extend the Distinct function so that you can do as follows:

var people = new List<Person>();

people.Add(new Person(1, "a", "b"));
people.Add(new Person(2, "c", "d"));
people.Add(new Person(1, "a", "b"));

foreach (var person in people.Distinct(p => p.ID))
{
    // do stuff with each unique person here.
}

Here's the article: Extending LINQ - Specifying a Property in the Distinct Function

Timothy Khouri
Your article has an error: there should be a <T> after Distinct: public static IEnumerable<T> Distinct(this... Also, it does not look like it will work (nicely) on more than one property, i.e. a combination of first and last names.
row1