views:

69

answers:

3

Say I have two arrays, items and removeItems and I wanted any values found in removeItems to be removed from items.

The brute force mechanism would probably be:

var animals = ["cow","dog","frog","cat","whale","salmon","zebra","tuna"];
var nonMammals = ["salmon","frog","tuna","spider"];
var mammals = [];
var isMammal;

for(var i=0;i<animals.length;i++){
   isMammal = true;
   for(var j=0;j<nonMammals;j++){
     if(nonMammals[j] === animals[i]){
       isMammal = false;
       break;
     }
   }
   if(isMammal){
     mammals.push(animals[i]);
   }
}

This is what? O(N^2)? Is there a more efficient way?

+5  A: 

That's actually O(M * N).

Probably you could do better sorting the animals array first, then doing a binary search. You'll be able to reduce to O(N * log N) - well, that's if log N < M anyway.

Anyway, if you're working with JS and that runs client-side, then just try to keep amount of data at minimum, or their browsers will yell at you with every request.

Seb
I was just writing something about sorting the lists first as well. You end up with n*lg(n) + m*lg(n) to sort the main list and then search main list m times, which is effectively m*lg(n) - much better than n^2
Stuart Branham
+3  A: 

With jQuery it's pretty easy:

function is_mammal(animal)
{
    return $.inArray(animal, nonMammals) == -1;
}

mammals = $.grep(animals, is_mammal);

See docs for $.grep and $.inArray.

tghw
That's easy, but doesn't make it any better than O(N^2). Just because you "hide" the loops doesn't mean $.inArray() and $.grep() don't have them.
Seb
+6  A: 

Basically what you want to do is efficiently compute the set difference S \ T. The (asymptotically) fastest way I know is to put T into a hashmap (that makes |T| steps) and go over each s in S (that makes |S| steps) checking wether s in T (which is O(1)). So you get to O(|T| + |S|) steps.

bayer
While this _is_ faster than my answer, it requires more memory for the hash table. There're pros and cons everywhere, right? :) Anyway, I believe this is the best solution, so there goes my +1.
Seb
Yes, that's why I wrote the asymptotically there. ;) Personally I'd guess that for small inputs even the squared solution suffices.
bayer