I understand how to call Distinct() on an IEnumerable, and that I have to create an IEqualityComparer for anything more advanced. However, is there a way to control which of the duplicated items is returned?

For example, say you have a List<MyClass>:

List<MyClass> test = new List<MyClass>();
test.Add(new MyClass {ID = 1, InnerID = 4});
test.Add(new MyClass {ID = 2, InnerID = 4});
test.Add(new MyClass {ID = 3, InnerID = 14});
test.Add(new MyClass {ID = 4, InnerID = 14});

You then do:

var distinctItems = test.Distinct(new DistinctItemComparer());

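class DistinctItemComparer : IEqualityComparer<MyClass> {

    // Two items count as duplicates when their InnerID values match.
    public bool Equals(MyClass x, MyClass y) {
        return x.InnerID == y.InnerID;
    }

    // The hash code must be derived from the same field that Equals compares.
    public int GetHashCode(MyClass obj) {
        return obj.InnerID.GetHashCode();
    }
}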

This code returns the items with ID 1 and 3. Is there a way to return the items with ID 2 and 4 instead?

+2  A: 

This doesn't sound like a job for Distinct; it sounds like a job for Where. In your case you want to filter the sequence:

var ids = new[] { 2, 4 };
var newSeq = test.Where(m => ids.Contains(m.ID));
Andrew Hare
+1. If you have to discern between the items, then clearly they are *distinct*.
Adam Robinson
@Adam: They can be distinct in one aspect but not in another.
Jon Skeet
That's exactly what I am doing, but I have to do it on InnerID, so it returns all 4 items. I need to skip the first instance of each. I think I may look at this approach - http://stackoverflow.com/questions/1183403/how-to-get-distinct-instance-from-a-list-by-lamba-or-linq/1183877#1183877
Jon
Why did this answer leave me with an image of Distinct and Where as caped superheroes?
Jeff Yates
@Jon (Skeet): Then that seems more like a grouping rather than duplicate items.
Adam Robinson
+1  A: 

No, there's no way.

Distinct() is used to find distinct elements. If you're worried about which element to return, then obviously the elements are not truly identical (and are therefore distinct), and you have a flaw in your design.

Justin Niessner
+2  A: 

If you want to select one particular element from each group of elements that your comparison considers equal, then you can use group by:

 var q = from t in test
         group t by t.InnerID into g
         select g.First(...);

In the select clause, you'll get a collection of elements that are equal and you can select the one specific element you need (e.g. using First(...)). You actually don't need to add Distinct to the end, because you're already selecting only a single element for each of the groups.
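
As a minimal sketch of the query above, assuming the wanted element in each group is the one with the highest ID (the question does not state the exact criterion). MyClass mirrors the class from the question; the GroupPickExample class and the HighestIdPerInnerId method are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class MyClass
{
    public int ID { get; set; }
    public int InnerID { get; set; }
}

static class GroupPickExample
{
    // Hypothetical helper: one result per InnerID, keeping the element with the highest ID.
    public static List<MyClass> HighestIdPerInnerId(IEnumerable<MyClass> test)
    {
        var q = from t in test
                group t by t.InnerID into g
                // Assumption: "the right one" means the highest ID within the group.
                select g.OrderByDescending(x => x.ID).First();
        return q.ToList();
    }

    static void Main()
    {
        var test = new List<MyClass>
        {
            new MyClass { ID = 1, InnerID = 4 },
            new MyClass { ID = 2, InnerID = 4 },
            new MyClass { ID = 3, InnerID = 14 },
            new MyClass { ID = 4, InnerID = 14 },
        };

        // Prints the items with ID 2 and ID 4.
        foreach (var item in HighestIdPerInnerId(test))
            Console.WriteLine($"ID {item.ID}, InnerID {item.InnerID}");
    }
}
```

Any other rule for picking the element can be swapped into the select clause without touching the grouping.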

Tomas Petricek
Once you have done the g.Last(), how would you then do a select new MyNewClass{...}?
Jon
You can write something like `let last = g.Last(...) select new MyNewClass { Sth = last.Whatever }`. The `let` clause allows you to store an intermediate result, which you can then use to construct any return type you need in the `select` clause.
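
A rough sketch of how that `let` clause might look in a full query. MyNewClass and Sth are the placeholder names from the comment; mapping Sth to the ID property, calling Last() without a predicate, and the LetExample wrapper are all assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class MyClass
{
    public int ID { get; set; }
    public int InnerID { get; set; }
}

// Placeholder result type; the real shape depends on what you need to return.
class MyNewClass
{
    public int Sth { get; set; }
}

static class LetExample
{
    public static List<MyNewClass> Project(IEnumerable<MyClass> test)
    {
        var q = from t in test
                group t by t.InnerID into g
                // Store the chosen element once, then reuse it in the projection.
                let last = g.Last()
                select new MyNewClass { Sth = last.ID };
        return q.ToList();
    }

    static void Main()
    {
        var test = new List<MyClass>
        {
            new MyClass { ID = 1, InnerID = 4 },
            new MyClass { ID = 2, InnerID = 4 },
            new MyClass { ID = 3, InnerID = 14 },
            new MyClass { ID = 4, InnerID = 14 },
        };

        // Prints 2 and 4: the last element of each InnerID group in source order.
        foreach (var item in Project(test))
            Console.WriteLine(item.Sth);
    }
}
```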
Tomas Petricek
+3  A: 

I don't believe it's actually guaranteed, but I'd be very surprised to see the behaviour of Distinct change from returning items in the order they occur in the source sequence.

So, if you want particular items, you should order your source sequence that way. For example:

items.OrderByDescending(x => x.ID)
     .Distinct(new DistinctItemComparer());

Note that one alternative to using Distinct with a custom comparer is to use DistinctBy from MoreLINQ:

items.OrderByDescending(x => x.ID)
     .DistinctBy(x => x.InnerID);

Although you can't guarantee that the normal LINQ to Objects ordering from Distinct won't change, I'd be happy to add a guarantee to MoreLINQ :) (It's the only ordering that is sensible anyway, to be honest.)
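
To illustrate what a "first occurrence wins" guarantee could look like, here is a minimal sketch of a key-based distinct operator. This is not MoreLINQ's actual source; DistinctByFirst and the surrounding class names are hypothetical, chosen to avoid clashing with any existing DistinctBy method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FirstWinsExtensions
{
    // Hypothetical helper: yields the first item seen for each key, in source order.
    public static IEnumerable<TSource> DistinctByFirst<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present,
            // so later items with the same key are skipped.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}

static class FirstWinsDemo
{
    static void Main()
    {
        var items = new[]
        {
            new { ID = 1, InnerID = 4 },
            new { ID = 2, InnerID = 4 },
            new { ID = 3, InnerID = 14 },
            new { ID = 4, InnerID = 14 },
        };

        // Ordering first decides which duplicate survives.
        var picked = items.OrderByDescending(x => x.ID)
                          .DistinctByFirst(x => x.InnerID);

        // Prints 4, then 2.
        foreach (var x in picked)
            Console.WriteLine(x.ID);
    }
}
```

Because the loop walks the source once and yields as it goes, the output order follows the input order by construction, which is why sorting beforehand controls the result.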

Yet another alternative would be to use GroupBy instead - then for each inner ID you can get all the matching items, and go from there.

Jon Skeet
+1, but this only works in this particular example, since it just happens that ordering the items produces the results the OP wanted. In other words, if the OP wanted the instances where ID=1 and ID=4, this trick wouldn't work.
Andrew Hare
the OP didn't really specify what the exact criteria were for picking the right instances - most of us guessed it was the later ID value within each InnerID
Damien_The_Unbeliever
+1  A: 

You don't want distinct then - you want to group your items and select the "maximum" element for them, based on ID:

    var distinctItems = test.Distinct(new DistinctItemComparer());

    var otherItems = test.GroupBy(a => a.InnerID, (innerID, values) => values.OrderBy(b => b.ID).Last());

    var l1 = distinctItems.ToList();
    var l2 = otherItems.ToList();

l1 is your current list; l2 is your desired list.

Damien_The_Unbeliever