I have two Lists which I want to check for numbers that appear in both.

For example:

List<int> a = new List<int>(){1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};

This should give the result 4. Is there an easy way to do this without too much looping through the lists?

I'm on .NET 3.0 for the project where I need this, so no LINQ.

+3  A: 

You can sort the second list, then loop through the first one and do a binary search on the second for each value.
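
A minimal sketch of that approach, reusing the question's lists (variable names are illustrative, not from the answer):

List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };

b.Sort(); // sort the second list once: O(n log n)
List<int> common = new List<int>();
foreach (int value in a)
{
    // List<T>.BinarySearch returns a non-negative index when the value is found
    if (b.BinarySearch(value) >= 0)
        common.Add(value);
}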

Otávio Décio
+8  A: 

You can use the .NET 3.5 .Intersect() extension method:

// Intersect is an extension method; requires "using System.Linq;"
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };

List<int> common = a.Intersect(b).ToList();
kronoz
Not sure if this will be available in 3.0 however...
kronoz
http://msdn.microsoft.com/de-de/library/bb460136.aspx ... nope
Andreas Niedermair
It's not, but thanks anyway. Now I know I can use this when we upgrade.
Gerrie Schenck
+1  A: 

var c = a.Intersect(b);

This only works in 3.5; I saw your requirement afterwards, my apologies.

JoshBerke
+2  A: 

If both lists are sorted, you can easily do this in O(n) time with a modified merge from merge-sort: simply "remove" (step a counter past) the lower of the two leading numbers, and whenever they are equal, save that number to the result list and "remove" both. It takes fewer than n1 + n2 steps. This of course assumes the lists are sorted, but sorting integer arrays isn't exactly expensive at O(n log n). The idea is pretty simple.
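
A rough sketch of the merge step (sorting copies first, since the lists in the question aren't sorted; names are illustrative):

List<int> sortedA = new List<int>(a);
List<int> sortedB = new List<int>(b);
sortedA.Sort();
sortedB.Sort();

List<int> common = new List<int>();
int i = 0, j = 0;
while (i < sortedA.Count && j < sortedB.Count)
{
    if (sortedA[i] < sortedB[j])
        i++; // "remove" the lower leading number
    else if (sortedA[i] > sortedB[j])
        j++;
    else
    {
        common.Add(sortedA[i]); // equal: save it and "remove" both
        i++;
        j++;
    }
}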

Chris J
Lists aren't sorted, and the amount of data isn't that big; Thanks anyway.
Gerrie Schenck
A: 

The method recommended by ocdecio is a good one if you're going to implement it from scratch. Looking at the time complexity compared to the naive method, we see:

Sort/binary search method: T ~= O(n log n) + O(n) * O(log n) ~= O(n log n)

Looping through both lists (naive method): T ~= O(n) * O(n) ~= O(n ^ 2)

There may be a quicker method, but I am not aware of it. Hopefully that should justify choosing his method.

Noldorin
the binary search method would be O(n log n) + O(n log n), since there would be one search per item in the list.
Chris J
Just fixed that right before you posted. Thanks anyway.
Noldorin
+11  A: 

Wintellect's excellent PowerCollections library has a Set class with intersection support. It works all the way back to .NET 2.0.

http://www.codeplex.com/PowerCollections

Set<int> set1 = new Set<int>(new int[] { 1, 2, 3, 4, 5 });
Set<int> set2 = new Set<int>(new int[] { 0, 4, 8, 12 });
Set<int> set3 = set1.Intersection(set2);
kirkus
+9  A: 

You could do it the way that LINQ does it, effectively - with a set. Now before 3.5 we haven't got a proper set type, so you'd need to use a Dictionary or something like that:

  • Create a Dictionary and populate it from list a, using the element as both the key and the value for the entry. (The value in the entry really doesn't matter at all.)
  • Create a new list for the intersections (or write this as an iterator block, whatever).
  • Iterate through list b, checking each element with dictionary.ContainsKey: if the key is present, add the element to the list or yield it.

That should be O(N+M) (i.e. linear in both list sizes)

Note that that will give you repeated entries if list b contains duplicates. If you wanted to avoid that, you could always change the value of the dictionary entry when you first see it in list b.
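
A minimal .NET 2.0-style sketch of the above, reusing the question's lists a and b (removing the key on first match is one way to avoid the repeated entries just mentioned):

Dictionary<int, int> set = new Dictionary<int, int>();
foreach (int item in a)
    set[item] = 0; // the value is irrelevant; the key is the element

List<int> intersection = new List<int>();
foreach (int item in b)
{
    if (set.ContainsKey(item))
    {
        intersection.Add(item);
        set.Remove(item); // so duplicates in b aren't added twice
    }
}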

Jon Skeet
I'm assuming that Dictionary<Key,Value>.ContainsKey() is faster than List<T>.Contains()?
Daniel Schaffer
Dictionary lookup is O(1), List lookup is O(N)
aku
Of course, the O(N) stuff only actually matters for large numbers. For very small lists, List.Contains is likely to be faster.
Jon Skeet
@Jon Skeet, I agree in this case (author talked about 15-20 entries per list) List.Contains would be enough.
aku
A: 

Here is a method that removes duplicate strings. Change this to accommodate int and it will work fine.

public List<string> removeDuplicates(List<string> inputList)
{
    Dictionary<string, int> uniqueStore = new Dictionary<string, int>();
    List<string> finalList = new List<string>();

    foreach (string currValue in inputList)
    {
        if (!uniqueStore.ContainsKey(currValue))
        {
            uniqueStore.Add(currValue, 0);
            finalList.Add(currValue);
        }
    }
    return finalList;
}

Update: Sorry, I am actually combining the lists and then removing duplicates. I am passing the combined list to this method. Not exactly what you are looking for.

G Jason Smith
+2  A: 

In a comment to the question, the author said that there will be

Max 15 in the first list and 20 in the second list

In this case I wouldn't bother with optimizations and use List.Contains.
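
For example, a one-line sketch using List<T>.FindAll, which takes a Predicate<T> and is available in .NET 2.0:

List<int> common = a.FindAll(b.Contains); // O(N*M), fine for 15-20 items per list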

For larger lists a hash can be used to take advantage of O(1) lookup, which leads to an O(N+M) algorithm as Jon noted.

A hash requires additional space; to reduce memory usage we should hash the shortest list.

List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> shortestList;
List<int> longestList;
if (a.Count > b.Count)
{
    shortestList = b;
    longestList = a;
}
else
{
    shortestList = a;
    longestList = b;                
}

Dictionary<int, bool> dict = new Dictionary<int, bool>();
shortestList.ForEach(x => dict.Add(x, true));

foreach (int i in longestList)
{
    if (dict.ContainsKey(i))
    {
        Console.WriteLine(i);
    }
}
aku
A: 

(Previous answer - changed IndexOf to Contains, as IndexOf casts to an array first)

Seeing as it's two small lists, the code below should be fine. I'm not sure if there's a library with an intersection method like Java has (although List isn't a set, so it wouldn't work anyway); as someone pointed out, the PowerCollections library has one.

List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };

List<int> result = new List<int>();
for (int i = 0; i < a.Count; i++)
{
    if (b.Contains(a[i]))
        result.Add(a[i]);
}

foreach (int i in result)
    Console.WriteLine(i);

Update: HashSet seems like the obvious answer:

// Method 2 - HashSet from System.Core
HashSet<int> aSet = new HashSet<int>(a);
HashSet<int> bSet = new HashSet<int>(b);
aSet.IntersectWith(bSet);
foreach (int i in aSet)
    Console.WriteLine(i);

Update 2: HashSet was a dumb answer, as it's 3.5 and not 3.0.
Chris S
Hi, the HashSet is only in 3.5
Ray Booysen
Yeah, I was a bit keen with that.
Chris S
+2  A: 

Tested on 3.0

List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> intersection = new List<int>();
Dictionary<int, int> dictionary = new Dictionary<int, int>();
a.ForEach(x => { if (!dictionary.ContainsKey(x)) dictionary.Add(x, 0); });
b.ForEach(x => { if (dictionary.ContainsKey(x)) dictionary[x]++; });
foreach (var item in dictionary)
{
    if (item.Value > 0)
        intersection.Add(item.Key);
}
pablito
Good work. Only one iteration over each list and using the dictionary.
Ray Booysen
This seems like a good naive implementation. It'll do the job, but it won't scale well with all the memory and computational overhead.
spoulson
It does the work in O(n). Anyway, there are different solutions to different problems; sometimes the naive solution is better than the generic optimized solution when the input is small. Given the current problem, I think the naive solution will do the job.
pablito
A: 
Jerry
The way you're doing it means that for each item in one list you're looping through each item in the other, so performance dies on larger sets of data.
Ray Booysen
Essentially O(M + M*N) instead of the dictionary examples, which have O(M+N) performance.
Ray Booysen
He asked how it could be done without LINQ. I didn't see anything about it being done with the least number of CPU cycles. No, it's not the speediest, but it does work, and it's easy to read and maintain.
Jerry