tags:

views:

658

answers:

4

Hi, I have a list of lists which I want to find the intersection for like this:

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };

// expected intersection is List<int>() { 3 };

Is there some way to do this with IEnumerable.Intersect()?

EDIT: I should have been more clear on this: I really have a list of lists, I don't know how many there will be, the three lists above was just an example, what I have is actually an IEnumerable<IEnumerable<SomeClass>>

SOLUTION

Thanks for all great answers. It turned out there were four options for solving this: List+aggregate (@Marcel Gosselin), List+foreach (@JaredPar, @Gabe Moothart), HashSet+aggregate (@jesperll) and HashSet+foreach (@Tony the Pony). I did some performance testing on these solutions (varying number of lists, number of elements in each list and random number max size.

It turns out that for most situations the HashSet performs better than the List (except with large lists and small random number size, because of the nature of HashSet I guess.) I couldn't find any real difference between the foreach method and the aggregate method (the foreach method performs slightly better.)

To me, the aggregate method is really appealing (and I'm going with that as the accepted answer) but I wouldn't say it's the most readable solution.. Thanks again all!

+3  A: 

You could do the following

var result = list1.Intersect(list2).Intersect(list3).ToList();
JaredPar
Thanks, but I really have a list of lists, not three separate lists.. I need something that works independent of how many lists there are in listOfLists.
Oskar
@Oskar You could easily run that in a loop
Gabe Moothart
+5  A: 

You can indeed use Intersect twice. However, I believe this will be more efficient:

HashSet<int> hashSet = new HashSet<int>(list1);
hashSet.IntersectWith(list2);
hashSet.IntersectWith(list3);
List<int> intersection = hashSet.ToList();

Not an issue with small sets of course, but if you have a lot of large sets it could be significant.

Basically Enumerable.Intersect needs to create a set on each call - if you know that you're going to be doing more set operations, you might as well keep that set around.

As ever, keep a close eye on performance vs readability - the method chaining of calling Intersect twice is very appealing.

EDIT: For the updated question:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = null;
    foreach (var list in lists)
    {
        if (hashSet == null)
        {
            hashSet = new HashSet<T>(list);
        }
        else
        {
            hashSet.IntersectWith(list);
        }
    }
    return hashSet == null ? new List<T>() : hashSet.ToList();
}

Or if you know it won't be empty, and that Skip will be relatively cheap:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = new HashSet<T>(lists.First());
    foreach (var list in lists.Skip(1))
    {
        hashSet.IntersectWith(list);
    }
    return hashSet.ToList();
}
Jon Skeet
@Skeet "Tony the Pony" ?
Gabe Moothart
Thanks 'Tony', but please see my updated questions
Oskar
Yeah, the foreach makes sense. Any performance difference with this compared to the Aggregate method in Marcel's answer?
Oskar
I believe Tony's answer will have a better performance.
Marcel Gosselin
@Oskar: Yes, my answer uses a single hashset instead of creating a new one each time. However, you could still use Aggregate with a set... will edit.
Jon Skeet
Ick... just tried to work out an Aggregate solution, and it's icky because HashSet.IntersectWith returns null :(
Jon Skeet
+3  A: 

Try this, it works but I'd really like to get rid of the .ToList() in the aggregate.

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };
var intersection = listOfLists.Aggregate((previousList, nextList) => previousList.Intersect(nextList).ToList());
Marcel Gosselin
Thanks, I just tried that out and it works! Havn't used Aggregate() before but I guess it was something like this I was looking for.
Oskar
+1: clean and cool solution.
Partha Choudhury
As I have specified as a comment on Tony's answer, I believe his solution will perform better.
Marcel Gosselin
+1  A: 

How about:

var intersection = listOfLists.Skip(1).Aggregate(new HashSet<T>(listOfLists.First()), (h, e) => { h.IntersectWith(e); return h; });

That way it's optimized by using the same HashSet throughout and still in a single statement. Just make sure that the listOfLists always contains at least one list.

jesperll