views:

5690

answers:

11

Given two lists (not necessarily sorted), what is the most efficient non-recursive algorithm to find the intersection of those lists?

+7  A: 

You could put all elements of the first list into a hash set. Then, iterate the second one and, for each of its elements, check the hash to see if it exists in the first list. If so, output it as an element of the intersection.

This sounds good, but I don't believe I have access to hashing algorithms either. Any suggestions?
Then, maybe:* sort list1 (time: n log n) * sort list2 (time: n log n) * merge the two and check for similar entries as you iterate the two sorted lists simultaneously (linear time)
I don't have enough points to comment on other threads, but regarding the point that quick sort is recursive: You can implement it without recursion. See here, for example: http://www.codeguru.com/forum/archive/index.php/t-333288.html
Thanks, added the details to my answer
Wookai
A: 

First, sort both lists using quicksort : O(n*log(n). Then, compare the lists by browsing the lowest values first, and add the common values. For example, in lua) :

function findIntersection(l1, l2)
    i, j = 1,1
    intersect = {}

    while i < #l1 and j < #l2 do
     if l1[i] == l2[i] then
      i, j = i + 1, j + 1
      table.insert(intersect, l1[i])
     else if l1[i] > l2[j] then
      l1, l2 = l2, l1
      i, j = j, i
     else
      i = i + 1
     end
    end

    return intersect
end

which is O(max(n, m)) where n and m are the sizes of the lists.

EDIT: quicksort is recursive, as said in the comments, but it looks like there are non-recursive implementations

Wookai
Isn't quicksort recursive? Or is there a non-recursive version of it?
Too bad that quicksort is a recursive algorithm...
oefe
I wouldn't call that O(max(n,m)). You're doing two sorts, too.
Tom Ritter
Is there a non-recursive version of mergesort that could work also?
Heapsort is non-recursive, but adds some data-structure needs. Would it be ok ?
Wookai
Heapsort might work. I'll have to look into it. Very frustrating -- this eviews language is clearly not meant for programmers.
Yep, looks so. Good luck with using it !
Wookai
+2  A: 

without hashing, I suppose you have two options:

  • The naive way is going to be compare each element to every other element. O(n^2)
  • Another way would be to sort the lists first, then iterate over them: O(n lg n) * 2 + 2 * O(n)
Tom Ritter
A: 

I got some good answers from this that you may be able to apply. I haven't got a chance to try them yet, but since they also cover intersections, you may find them useful.

StingyJack
A: 

Why not implement your own simple hash table or hash set? It's worth it to avoid nlogn intersection if your lists are large as you say.

Since you know a bit about your data beforehand, you should be able to choose a good hash function.

Imran
A: 

If there is a support for sets (as you call them in the title) as built-in usually there is a intersection method.

Anyway, as someone said you could do it easily (I will not post code, someone already did so) if you have the lists sorted. If you can't use recursion there is no problem. There are quick sort recursion-less implementations.

Andrea Ambu
1. If eviews would support sets, it would probably offer a method for set intersections2. How can joining two sets help here. The intersection are those elements that are in both sets. When I here joining I think of calculation the union of two sets
f3lix
Yup you're right, I did not pay attention. I edited :)
Andrea Ambu
java has support for sets but no built-in intersection functions.
lensovet
@lensovet: if it implements java.util.Set there is the java.util.Set.retainAll(Collection) method. Its documentation reads: "If the specified collection is also a set, this operation effectively modifies this set so that its value is the *intersection* of the two sets."
Andrea Ambu
oh wow, that's good to know. thanks!
lensovet
A: 

I second the "sets" idea. In JavaScript, you could use the first list to populate an object, using the list elements as names. Then you use the list elements from the second list and see if those properties exist.

Nosredna
A: 

In PHP, something like

function intersect($X) { // X is an array of arrays; returns intersection of all the arrays
  $counts = Array(); $result = Array();
  foreach ($X AS $x) {
    foreach ($x AS $y) { $counts[$y]++; }
  }
  foreach ($counts AS $x => $count) {
    if ($count == count($X)) { $result[] = $x; }
  }
  return $result;
}
A: 

in C++ the following can be tried using STL map

vector<int> set_intersection(vector<int> s1, vector<int> s2){

vector<int> ret;
map<int, bool> store;
for(int i=0; i < s1.size(); i++){

    store[s1[i]] = true;
}
for(int i=0; i < s1.size(); i++){

    if(store[s2[i]] == true) ret.push_back(s2[i]);

}
return ret;

}

quasar
A: 

You might want to take a look at Bloom filters. They are bit vectors that give a probabilistic answer whether an element is a member of a set. Set intersection can be implemented with a simple bitwise AND operation. If you have a large number of null intersections, the Bloom filter can help you eliminate those quickly. You'll still have to resort to one of the other algorithms mentioned here to compute the actual intersection, however. http://en.wikipedia.org/wiki/Bloom_filter

am
A: 

From the eviews features list it seems that it supports complex merges and joins (if this is 'join' as in DB terminology, it will compute an intersection). Now dig through your documentation :-)

Additionally, eviews has their own user forum - why not ask there_

zvrba