ansaurus

Question

Answer 1

+4 A:

Use HashSet<T> directly - it has a SymmetricExceptWith method:

HashSet<T> data = new HashSet<T>(a);
data.SymmetricExceptWith(b);

EDIT: If you want to maintain the order, here's an alternative:

HashSet<T> data = new HashSet<T>(a);
data.IntersectWith(b);
foreach (T t in a.Concat(b))
{
    if (!data.Contains(t))
    {
        yield return t;
    }
}

This has the following important differences:

Both a and b are iterated over twice. In some cases that could be a very bad thing - you could call ToList on each of them to start with to retain a buffer.
If there are duplicates in either a or b, they will be yielded multiple times. If you wanted to avoid this you could keep a set of already-yielded values. At this point, it would be equivalent to:
```
a.Concat(b).Except(a.Intersect(b))
```

That's still only two set operations instead of the three in your original code though.

Jon Skeet 2010-05-26 05:39:13

Thanks Jon for your quick reply. HashSet work fine as long as you are not interested in the original order of the items. What if I want to keep the order of the items in `a` and `b` in the difference?

Pierre 2010-05-26 05:48:21

@Pierre: I've edited my answer with another couple of options.

Jon Skeet 2010-05-26 06:05:12

@Jon: Thanks a lot for your time. In my case, `a` and `b` do not contain duplicates, so this is not a concern. The LINQ expression you propose is much more readable (and therefore maintainable) than the piece of code involving the `HashSet`. I like it!

Pierre 2010-05-26 06:16:07

Answer 2

+1 A:

Given a.Except(b) and b.Except(a) are disjoint, you can use concat instead of union, saving a set operator (and concat is more efficient).

return a.Except (b).Concat (b.Except (a));

This still runs through each list twice.

Cameron MacFarland 2010-05-26 06:26:48

@Cameron: Thank you; you are right, `Concat` will be faster than `Union` since my inputs are disjoint; I had overlooked that point.

Pierre 2010-05-26 12:57:49

ansaurus

tags:

views:

answers:

LINQ and set difference

related questions