views:

401

answers:

3

I thought that I understood Intersect, but it turns out I was wrong.

 List<int> list1 = new List<int>() { 1, 2, 3, 2, 3};
 List<int> list2 = new List<int>() { 2, 3, 4, 3, 4};

 list1.Intersect(list2) =>      2,3

 //But what I want is:
 // =>  2,3,2,3,2,3,3

I can figure a way like:

 var intersected = list1.Intersect(list2);
 var list3 = new List<int>();
 list3.AddRange(list1.Where(I => intersected.Contains(I)));
 list3.AddRange(list2.Where(I => intersected.Contains(I)));

Is there a easier way in LINQ to achieve this?

=== UPDATE ===

I do need to state that I do not care in which order the results are given.

2,2,2,3,3,3,3 would also be perfectly OK.

Problem is that I am using this on a very large collection, So I need efficiency.

=== UPDATE 2 === Yes, we are talking about Objects, not ints. The ints were just for the easy example, but I realize this can make a difference.

However, I found a better way to avoid the problem of having huge collections, so the given answer is good enough for my situation and understanding.

A: 

I don't believe this is possible with the built-in APIs. But you could use the following to get the result you're looking for.

IEnumerable<T> Intersect2<T>(this IEnumerable<T> left, IEnumerable<T> right) {
  var map = left.ToDictionary(x => x, y => false);
  foreach ( var item in right ) {
    if (map.ContainsKey(item) ) {
      map[item] = true;
    }
  }
  foreach ( var cur in left.Concat(right) ) {
    if ( map.ContainsKey(cur) ) {
      yield return cur;
    }
  }
}
JaredPar
+5  A: 

Let's see if we can precisely characterize what you want. Correct me if I am wrong. You want: all elements of list 1, in order, that also appear in list 2, followed by all elements of list 2, in order, that also appear in list 1. Yes?

Seems straightforward.

return list1.Where(x=>list2.Contains(x))
     .Concat(list2.Where(y=>list1.Contains(y))
     .ToList();

Note that this is not efficient for large lists. If the lists have a thousand items each then this does a couple million comparisons. If you're in that situation then you want to use a more efficient data structure for testing membership:

list1set = new HashSet(list1);
list2set = new HashSet(list2);

return list1.Where(x=>list2set.Contains(x))
     .Concat(list2.Where(y=>list1set.Contains(y))
     .ToList();

which only does a couple thousand comparisons, but potentially uses more memory.

Eric Lippert
Your LINQ queries do not give the same results as your other two queries - if element e occurs n times in list1 and m in list2, they contain it n*m times, which isn't the desired behavior.
kvb
*Excellent catch* @kvb. I totally missed that because in the given example, they happen to look confusingly similar. I'll remove the incorrect code. Thanks!
Eric Lippert
Interesting about the HashSet. I didn't know that it was more efficient. Will look into it!
Peterdk
@Peterdk: Lists are O(n) to test for membership of an element, but in exchange give you the ability to (1) maintain an ordering, and (2) have duplicates. HashSets are O(1) to test for membership of an element but do not keep the elements in order and never contain duplicates. If you're willing to double your memory and use both a HashSet *and* a List, you can get the best of both worlds.
Eric Lippert
+1  A: 
var set = new HashSet(list1.Intersect(list2));
return list1.Concat(list2).Where(i=>set.Contains(i));
George Polevoy