I have 2 List objects:

List<int> lst1 = new List<int>();
List<int> lst2 = new List<int>();

Let's say they have values:

lst1.Add(1);
lst1.Add(2);
lst1.Add(3);
lst1.Add(4);

lst2.Add(1);
lst2.Add(4);

I need to get an object containing the "distinct" list of both of these; so in this case the return would be List {2, 3}.

Is there an easy way to do this? Or do I need to iterate through each value of the lists and compare?

I am open to using ObjectQuery, LINQ, etc., as these lists are coming from a database and could potentially be several hundred to several thousand entries long.

Thanks!

+1  A: 

If the results are coming from a database, it would be better to process them there. That will be much faster than an in-memory operation, especially if there are lots of results.

If you have to process them in your code, you can use i4o (Indexed LINQ) to make it faster.

Giorgi
+5  A: 

EDIT: As the comments point out, Except alone is not enough to get a symmetric difference. If the 2nd list contains a value that is not in the 1st, Except alone gives the wrong result. To get the proper result, try this:

var list1 = new List<int>(Enumerable.Range(1,4));
var list2 = new List<int> { 1, 4, 6 };

var result = list1.Except(list2).Union(list2.Except(list1));

The above returns {2, 3, 6}.

Note that you'll need to add a ToList() if you really need a List<int>, otherwise the above operation will return an IEnumerable<int>.


Use the Enumerable.Except method, which produces the set difference of two sequences:

var result = lst1.Except(lst2);
Ahmad Mageed
It works on this example, but `Except` gives set difference, not symmetric difference. In particular, `lst2.Except(lst1)` would give a different answer (the empty list).
Thomas
In that case, the actual answer would be the union of `lst1.Except(lst2)` and `lst2.Except(lst1)`...
Dan Puzey
Agreed with @Thomas, this is completely not what the OP wanted. He wants to know which items are distinct in the set.
casperOne
@Dan: Exactly... but that is at least twice as slow, probably...
Thomas
@Thomas @Dan @casperOne thanks for the feedback. I've updated with what I believe should now yield the correct result.
Ahmad Mageed
@casper: To say it's completely not what the OP asked for is certainly not true. In fact, for the example *this produces the desired output*. However, @Thomas is correct in that it's asymmetric, so @Jon's answer is more correct.
Adam Robinson
Yeah, @Jon got what I needed. The solution here works in 95% of the cases, but in the remaining 5% data will appear in the second list and not the first, and I need to report on that.
SlackerCoder
@Mario that has been addressed in my recent edit, so either solution will do now.
Ahmad Mageed
A: 

In Linq:

var distinct = lst2.Concat(lst1).Distinct().ToList();

(Your explanation appears to say that you want the unique elements in both lists. But your example appears to show that you want the union of both lists.)

Keltex
This does not address the question at all. He doesn't ask for the distinct elements in each list, nor does he ask for a union. He wants the opposite of an intersection, which is `Except`.
Adam Robinson
He says 'I need to get an object containing the "distinct" list of both of these'. My interpretation is all the elements in both lists. However his example is contradictory to that.
Keltex
@Keltex: I would rely on the OP expressing his desire in a description rather than the proper use of terms.
Adam Robinson
+7  A: 

Ahmad is nearly right with Except, I believe - but that won't give you items which are in lst2 but not in lst1. So in the example you gave, if you added 5 to lst2, I imagine you'd want the result to be {2, 3, 5}. In that case, you want a symmetric difference. I don't think there's any way to do that directly in LINQ to Objects in a single call, but you can still achieve it. Here's a simple but inefficient way to do it:

lst1.Union(lst2).Except(lst1.Intersect(lst2)).ToList();

(Obviously you only need ToList() if you genuinely need a List<T> instead of an IEnumerable<T>.)

The way to read this is "I want items that are in either list but not in both."

It's possible that it would be more efficient to use Concat - which would still work as Except is a set based operator which will only return distinct results:

lst1.Concat(lst2).Except(lst1.Intersect(lst2)).ToList();
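
For comparison, outside LINQ the framework's HashSet<T> has a SymmetricExceptWith method that computes the same "in either but not both" result in one call. The following is only a minimal sketch, not part of the original answer: it assumes the question's lst1 and lst2 with 5 added to lst2, it copies lst1 into a set because the method modifies the set in place, and it is written with top-level statements (C# 9 or later; on older compilers the body would go inside a Main method).

```csharp
using System;
using System.Collections.Generic;

var lst1 = new List<int> { 1, 2, 3, 4 };
var lst2 = new List<int> { 1, 4, 5 };

// Copy lst1 into a set so the original list is left untouched.
var symmetricDifference = new HashSet<int>(lst1);

// In-place operation: afterwards the set holds the elements that are
// present in exactly one of the two collections.
symmetricDifference.SymmetricExceptWith(lst2);

// HashSet<T> makes no ordering guarantee, so sort before displaying.
var result = new List<int>(symmetricDifference);
result.Sort();

Console.WriteLine(string.Join(", ", result)); // 2, 3, 5
```

Whether this is faster than the LINQ versions for a few thousand entries is something to measure rather than assume.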
Jon Skeet
@Jon would you say your approach is faster than mine? I did a quick mini-benchmark and it looks that way, although probably negligible in the greater scheme of things. I guess the `Concat` helps versus using `Union`.
Ahmad Mageed
@Ahmad: Well, using Concat I'm only doing two set operations instead of three... it would take quite a bit of work to analyze it, I think. I think my answer is simpler to understand - although I suppose I would...
Jon Skeet
@Jon that's good enough for me. It makes sense that the number of set operations likely contributes to it since more work is done to determine which values match up for each respective operation. Thanks!
Ahmad Mageed
A: 

LINQ would be the easy way, grouping each item based on its value and then selecting only the groups that have one element:

from i in lst1.Concat(lst2)
group i by i into g
where !g.Skip(1).Any()
select g.Key;

Using Skip here lets you check that no more than one element exists in each group: if the group has one element or fewer, Skip(1) returns an empty sequence, on which Any returns false.
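
A runnable sketch of the grouping query, using the question's lists with 5 added to lst2 as assumed sample data. One thing to watch for: if a value is repeated within a single list, it forms a group of two and is dropped even when the other list does not contain it. The Distinct() calls below are an addition to guard against that, not part of the query as written above. The sketch uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lst1 = new List<int> { 1, 2, 3, 4 };
var lst2 = new List<int> { 1, 4, 5 };

// Distinct() on each list first, so a value repeated within one list
// is not mistaken for a value shared by both lists.
var onlyInOne =
    (from i in lst1.Distinct().Concat(lst2.Distinct())
     group i by i into g
     where !g.Skip(1).Any()   // keep groups with exactly one element
     select g.Key).ToList();

Console.WriteLine(string.Join(", ", onlyInOne)); // 2, 3, 5
```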

casperOne
A: 

Just that -

var distinct = a.Union(b);
Maxim