views:

297

answers:

3
+3  Q: 

LINQ exclusion

Is there a direct LINQ syntax for finding the members of set A that are absent from set B? In SQL I would write this:

SELECT A.* FROM A LEFT JOIN B ON A.ID = B.ID WHERE B.ID IS NULL
+8  A: 

Except operator
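
A minimal sketch of how Except might be applied to the question, assuming LINQ to Objects and a hypothetical Item type with an ID property (neither appears in the question). Except compares whole elements using the default equality comparer, so this sketch runs it over the ID values and then picks the matching rows of A:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for illustration; the question only says rows have an ID.
public class Item
{
    public int ID { get; set; }
    public string Name { get; set; } = "";
}

public static class ExclusionSketch
{
    // Returns the items of a whose ID does not appear in b.
    public static List<Item> NotIn(IEnumerable<Item> a, IEnumerable<Item> b)
    {
        // Except runs over the ID values themselves, so default int equality is enough.
        var missingIds = new HashSet<int>(
            a.Select(x => x.ID).Except(b.Select(x => x.ID)));

        // Keep the rows of a whose ID survived the Except.
        return a.Where(x => missingIds.Contains(x.ID)).ToList();
    }
}
```

Calling Except directly on the two collections would also work if the element type defines equality in terms of ID, or if an IEqualityComparer is passed in.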

Quassnoi
That's the first time I've had an answer that was complete, directly applicable and effective.
Peter Wone
+3  A: 

I believe your LINQ would be something like the following.

// Collect the A items that have a match in B, then subtract them from A.
var items = A.Except(
  from itemA in A
  from itemB in B
  where itemA.ID == itemB.ID
  select itemA);

Update

As Maslow points out in the comments, this may well not be the most performant query. As with any code, it is important to profile it to find bottlenecks and inefficient algorithms. In this case, chaowman's answer performs better.

The reasons become clear on examining the queries. In the example I provided, there are at least two loops over the A collection, one to join A with B and another to perform the Except operation, whereas in chaowman's answer (reproduced below) the A collection is iterated only once.

// chaowman's solution only iterates A once and partially iterates B
var results = from itemA in A
              where !B.Any(itemB => itemB.Id == itemA.Id)
              select itemA;

Also, in my answer, the B collection is iterated in its entirety for every item in A, whereas in chaowman's answer it is iterated only up to the point at which a match is found.
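
The difference in how much of B gets scanned can be illustrated with a small sketch that counts ID comparisons. It assumes LINQ to Objects over plain lists of integer IDs, and the class and method names are hypothetical. It says nothing about the SQL a LINQ to SQL provider would generate:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitSketch
{
    // Counts ID comparisons made by the !B.Any(...) form.
    public static int CountAnyComparisons(IList<int> aIds, IList<int> bIds)
    {
        int comparisons = 0;
        // Any stops scanning bIds as soon as the predicate returns true.
        _ = aIds.Where(a => !bIds.Any(b => { comparisons++; return b == a; }))
                .ToList();
        return comparisons;
    }

    // Counts ID comparisons made by the cross join followed by Except.
    public static int CountJoinComparisons(IList<int> aIds, IList<int> bIds)
    {
        int comparisons = 0;
        // Every a is compared against every b, match or no match.
        var matched = aIds.SelectMany(
            a => bIds.Where(b => { comparisons++; return a == b; })
                     .Select(_ => a));
        _ = aIds.Except(matched).ToList();
        return comparisons;
    }
}
```

With A and B both holding the IDs 1, 2, 3, the Any form makes 6 comparisons and the join form makes 9. The gap closes when few items in A have a match, because Any then has to scan all of B as well.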

As you can see, even before looking at the SQL generated, you can spot potential performance issues just from the query itself. Thanks again to Maslow for highlighting this.

Jeff Yates
This LINQ generated an absolutely awful SQL query in LINQPad that took 1 min 17 seconds to run against 2 very large tables. chaowman's took 7.5 seconds. I'd check the efficiency of each of these answers before implementing them large scale.
Maslow
@Maslow: You are right, there is no doubt that queries need to be profiled and optimized like any other piece of code, but it's important to get it working first and then optimize. Before you actually profiled these, would you really have known which one was going to perform better?
Jeff Yates
Yes, and it certainly could be a lack of keys in this legacy database.
Maslow
Good point. I didn't consider the keys side of things.
Jeff Yates
+2  A: 
var results = from itemA in A
              where !B.Any(itemB => itemB.Id == itemA.Id)
              select itemA;
chaowman
This query ran in 7.5 seconds against 2 large tables and generated a nice where not exists query.
Maslow
Nice. Certainly pays to profile. :)
Jeff Yates