ansaurus

Question

Answer 1

+3 A:

The filter constructed by linear search can be simplified to:

MemberQ[b, #]& /@ a

Anyway, you could construct a sparse array from b by:

s = SparseArray[b -> ConstantArray[True, Length@b], Max[a, b], False];

then for indices i in the sparse array, s[[i]] will return True, and for those outside, s[[i]] will return False. So the list can be constructed with

s[[#]]& /@ a

We can compare the results:

In[130]:= alist = Prime /@ Range[2000];
          blist = alist + 2;

In[165]:= Timing[MemberQ[blist, #]& /@ alist;]

Out[165]= {0.91024,Null}

In[168]:= Timing[
           s = SparseArray[blist -> ConstantArray[True, Length@blist],
                           Max[alist, blist], False];
           s[[#]]& /@ alist;]

Out[168]= {0.017772,Null}

KennyTM 2010-09-12 19:26:11

Regarding your first sentence, probably the asker just meant MemberQ[b, element] instead of MemberQ[element,b]. Not to say your version isn't better.

dreeves 2010-09-12 20:18:42

@dreeves: Actually I am talking about the `Table[..., {element,a}]` part. The MemberQ arguments is a minor thing.

KennyTM 2010-09-12 20:31:18

But there's nothing wrong with the Table[..., {element,a}] part (except perhaps that it's not as efficient/elegant as your Map-based version). To remove any ambiguity about this, I think I'll go ahead and fix the typo in the question so you can actually run the asker's example. [Done; added expected output too.]

dreeves 2010-09-13 05:11:19

@dreeves Thanks for the clarification. I deleted my answer.

belisarius 2010-09-13 05:36:54

@dreeves: Nevermind. It was the problem of my testing code which caused `Table[..., {element, a}]` to throw an error.

KennyTM 2010-09-13 07:31:33

Answer 2

+6 A:

The following question is equivalent to yours and contains an answer in Mathematica:

http://stackoverflow.com/questions/1954636/given-list-subset-of-list-return-bitmask

Here's that answer (which I'll feel free to steal since it's actually mine):

bitmask[a_, b_] := Module[{f},
  f[_] = False;
  (f[#] = True)& /@ b;
  f /@ a]

The idea is to make a hash table, f, that can answer in constant time whether a given element is a member of list b. First f is given a default value of False (if we haven't seen it before it's not in b). Then a single pass through the list b is made, setting f[i] to True for each i in b. That's all the set up. Now a single pass of the hash function f over the list a gives us the answer. Total time is O(n+m) -- one pass through each list.

dreeves 2010-09-12 20:29:43

Thanks a million, works like a charm. Where do you learn the subtle nuances of pattern matching anyway? Like how do you know it's roughly O(1) time, etc. The only thing I really dislike about mathematica, is that even though it's supposed to be mathematicy, it almost never lists complexities.

Gleno 2010-09-13 11:47:58

@Gleno I think your comment is a very good seed for another question :)

belisarius 2010-09-13 18:45:55

@dreeves Very elegant! +1

belisarius 2010-09-13 18:46:49

@Gleno: I'm don't think of this as involving subtleties of pattern matching but rather just following Mathematica's syntax for creating a hash function (aka dictionary). I added some exposition about the algorithm in case that's useful.

dreeves 2010-09-13 19:16:28

There are some subtleties with function definitions for Module[] variables. I tend to use Dispatch[] to avoid these issues, or call Clear[f] explicitly to make sure everything gets cleaned up.

Joshua Martell 2010-09-15 01:53:50

@Joshua: I'd love to learn more about that since this is something I blithely do all the time. The f in the Module is really something like f$246 -- unique to the Module -- so I'm having trouble imagining what could go wrong.

dreeves 2010-09-15 15:47:34

I'm not able to reproduce now, but as I recall, it had something to do with setting downvalues of Module[] or Block[]'ed variables. It's possible it was fixed in V7.

Joshua Martell 2010-09-16 03:07:08

Answer 3

+5 A:

Another idea would be to use rules and Dispatch. It seems to be faster when the length of the blist is large:

alist = Prime /@ Range[20000];
blist = alist + 2;

membQ = First@Timing[MemberQ[blist,#]&/@alist;]

sparse = First@Timing[
    s = SparseArray[blist -> ConstantArray[True, Length@blist], Max[alist, blist], False];
    s[[#]] & /@ alist;
]

rules = First@Timing[
    bRules = Dispatch[Append[Rule[#, True] & /@ blist, _ -> False]];
    (# /. bRules) & /@ alist;
]

fct = First@Timing[
    f[_] = False;
    (f[#] = True) & /@ blist;
    f /@ alist;
]

On my laptop, rules (0.06s) < fct (0.09s) < sparse (0.9s) < membQ (40s).

Edit / Comparing fct and rules timings

@Karsten please feel free to rollback to your original answer

Tables generated with Prime[...]

alt text

Tables generated with RandomInteger[...]

alt text

Karsten W. 2010-09-13 06:02:11

Thanks for doing the timing comparison. I think you deserve Accepted Answer for that even if you didn't also have the fastest answer. But about that, the .06 vs .09 is unconvincing -- want to try it with order of magnitude bigger input to make sure it holds up?

dreeves 2010-09-13 07:30:04

Which one is better, seems to depend on alist and blist, too. With alist and blist set to RandomInteger[**100000**, 800000], the Dispatch solution is with 5.2s better than the functional solution (5.4), while on RandomInteger[**10000**, 800000] the functional solution is faster (5.3s vs. 5.7s). The sparse solution emits error messages, which I did not look into..

Karsten W. 2010-09-13 08:21:52

I re-awarded accepted answer to you, because dreeves request it. Thanks for timing comparisons, I feel somewhat uneasy wasting people's time like that. :)

Gleno 2010-09-13 11:54:24

@Karsten: Interesting, seems like sparse and functional are rather neck and neck though it looks like functional is simpler, especially if we wanted to figure out how to suppress sparse's error messages.

dreeves 2010-09-13 18:58:31

@Gleno: Certainly don't feel uneasy. You asked an awesome question that inspired people to do some great work that will help the whole community. You're living the StackOverflow dream! :)

dreeves 2010-09-13 19:01:29

@dreeves @Karsten .. added two comparative plots

belisarius 2010-09-13 20:02:41

@belisarius: looks good! @Gleno: it's fun!

Karsten W. 2010-09-14 06:14:53

ansaurus

tags:

views:

answers:

MemberQ in Mathematica

related questions