I want to find all items in one collection that do not match another collection. The collections are not of the same type, though; I want to write a lambda expression to specify equality.

A LINQPad example of what I'm trying to do:

void Main()
{
    var employees = new[]
    {
        new Employee { Id = 20, Name = "Bob" },
        new Employee { Id = 10, Name = "Bill" },
        new Employee { Id = 30, Name = "Frank" }
    };

    var managers = new[]
    {
        new Manager { EmployeeId = 20 },
        new Manager { EmployeeId = 30 }
    };

    var nonManagers =
        from employee in employees
        where !(managers.Any(x => x.EmployeeId == employee.Id))
        select employee;

    nonManagers.Dump();

    // Based on cdonner's answer:

    var nonManagers2 =
        from employee in employees
        join manager in managers
            on employee.Id equals manager.EmployeeId
            into tempManagers
        from manager in tempManagers.DefaultIfEmpty()
        where manager == null
        select employee;

    nonManagers2.Dump();

    // Based on Richard Hein's answer:

    var nonManagers3 =
        employees.Except(
            from employee in employees
            join manager in managers
                on employee.Id equals manager.EmployeeId
            select employee);

    nonManagers3.Dump();
}

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

The above works, and will return Employee Bill (#10). It does not seem elegant, though, and it may be inefficient with larger collections. In SQL I'd probably do a LEFT JOIN and find items where the second ID was NULL. What's the best practice for doing this in LINQ?

EDIT: Updated to prevent solutions that depend on the Id equaling the index.

EDIT: Added cdonner's solution - anybody have anything simpler?

EDIT: Added a variant on Richard Hein's answer, my current favorite. Thanks to everyone for some excellent answers!

+2  A: 
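    /// <summary>
    /// Returns the items in one sequence that have no counterpart,
    /// matched by key, in another sequence of a different type.
    /// </summary>
    /// <typeparam name="T">Type of the items to filter.</typeparam>
    /// <typeparam name="TOther">Type of the items to exclude by.</typeparam>
    /// <typeparam name="TKey">Type of the key used for matching.</typeparam>
    /// <param name="items">The sequence to filter.</param>
    /// <param name="other">The sequence whose keys are excluded.</param>
    /// <param name="getItemKey">Extracts the key from an item.</param>
    /// <param name="getOtherKey">Extracts the key from an item in <paramref name="other"/>.</param>
    /// <returns>The items whose key matches no key in <paramref name="other"/>.</returns>
    public static IEnumerable<T> Except<T, TOther, TKey>(
                                           this IEnumerable<T> items,
                                           IEnumerable<TOther> other,
                                           Func<T, TKey> getItemKey,
                                           Func<TOther, TKey> getOtherKey)
    {
        // Group join: tempItems holds every match for the current item.
        // Testing the group for emptiness avoids comparing against
        // default(TOther), which misfires when TOther is a value type.
        return from item in items
               join otherItem in other on getItemKey(item)
               equals getOtherKey(otherItem) into tempItems
               where !tempItems.Any()
               select item;
    }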

I don't remember where I found this method.
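
A usage sketch for the method above, applied to the Employee and Manager types from the question. The container class name SetExtensions and the demo class ExceptUsageDemo are assumptions, since the answer shows only the method itself. The copy of the method here tests the join group for emptiness, which gives the same result for reference types such as Manager.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical container class; extension methods must live in a static class.
public static class SetExtensions
{
    public static IEnumerable<T> Except<T, TOther, TKey>(
        this IEnumerable<T> items,
        IEnumerable<TOther> other,
        Func<T, TKey> getItemKey,
        Func<TOther, TKey> getOtherKey)
    {
        // Keep only the items whose group of key matches is empty.
        return from item in items
               join otherItem in other
                   on getItemKey(item) equals getOtherKey(otherItem)
                   into matches
               where !matches.Any()
               select item;
    }
}

public static class ExceptUsageDemo
{
    public static List<string> NonManagerNames()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };

        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        // Three arguments after the receiver select the keyed overload,
        // not Enumerable.Except.
        return employees
            .Except(managers, e => e.Id, m => m.EmployeeId)
            .Select(e => e.Name)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", NonManagerNames())); // prints "Bill"
    }
}
```

The two key selectors are what let the collections differ in type: only the keys have to be comparable.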

cdonner
+1 - Nice. I modified this slightly and incorporated it in my question. I want to see what others come up with, though. Thanks!
TrueWill
+2  A: 

Have a look at the Except() LINQ function. It does exactly what you need.

nitzmahone
The Except function only works with two sets of the same object type, so it would not directly apply to his example with employees and managers. Hence the overloaded method in my answer.
cdonner
+2  A: 
var nonmanagers = employees.Select(e => e.Id)
    .Except(managers.Select(m => m.EmployeeId))
    .Select(id => employees.Single(e => e.Id == id));
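
A related sketch, not what the snippet above does: the Single call scans employees once per remaining ID, so for larger collections it may be cheaper to collect the manager IDs in a HashSet<int> and filter the employees directly. The class name HashSetDemo and method name NonManagers are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical demo class, not part of the original answer.
public static class HashSetDemo
{
    public static List<Employee> NonManagers(
        IEnumerable<Employee> employees,
        IEnumerable<Manager> managers)
    {
        // Build the ID lookup once; each Contains call is then a hash probe
        // instead of a scan over the managers.
        var managerIds = new HashSet<int>(managers.Select(m => m.EmployeeId));

        // Filtering the employees keeps the objects themselves, so there is
        // no need to map IDs back to employees afterwards.
        return employees.Where(e => !managerIds.Contains(e.Id)).ToList();
    }

    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };

        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        foreach (var employee in NonManagers(employees, managers))
        {
            Console.WriteLine(employee.Name); // prints "Bill"
        }
    }
}
```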
gWiz
There is no guarantee that the EmployeeId will match the employee index in the array...
Thomas Levesque
Nice idea - I didn't think of selecting the IDs so that Except with the default equality comparer would compare integers. However, Mr. Levesque is correct, and I've updated the example to reflect this. Can you provide an example that correctly returns the employees?
TrueWill
Ah, you're right. The answer has been updated.
gWiz
(I deleted my previous comment - gWiz is right; this will work.)
TrueWill
+1  A: 

    var nonManagers = (from e1 in employees
                       select e1).Except(
                           from m in managers
                           from e2 in employees
                           where m.EmployeeId == e2.Id
                           select e2);

Partha Choudhury
+1. Elegant and works correctly.
TrueWill
Thanks. I originally found it here: http://rsanidad.wordpress.com/2007/10/16/linq-except-and-intersect/
Partha Choudhury
+4  A: 

This is almost the same as some of the other examples, but with less code:

employees.Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e));

It's not any simpler than employees.Where(e => !managers.Any(m => m.EmployeeId == e.Id)) or your original syntax, however.
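
A minimal sketch putting the two forms side by side on the question's sample data. The class name CompareDemo and the method names ViaExcept and ViaWhere are made up for illustration. One difference worth knowing: Except has set semantics, so it drops duplicate employees, and with no comparer supplied it compares Employee objects by reference. That works here because the join returns the same instances that are in employees.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical demo class comparing the two expressions from the answer.
public static class CompareDemo
{
    // Employees minus the employees that join to a manager.
    public static List<string> ViaExcept(Employee[] employees, Manager[] managers)
    {
        return employees
            .Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e))
            .Select(e => e.Name)
            .ToList();
    }

    // Employees for which no manager has a matching EmployeeId.
    public static List<string> ViaWhere(Employee[] employees, Manager[] managers)
    {
        return employees
            .Where(e => !managers.Any(m => m.EmployeeId == e.Id))
            .Select(e => e.Name)
            .ToList();
    }

    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };

        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        Console.WriteLine(string.Join(", ", ViaExcept(employees, managers))); // prints "Bill"
        Console.WriteLine(string.Join(", ", ViaWhere(employees, managers)));  // prints "Bill"
    }
}
```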

Richard Hein
Actually I like this better than the other solutions - I find its meaning clearer. I rewrote the join in query syntax (see the revised sample code in my question) out of personal preference. Thank you!
TrueWill