Hello everyone,

Here is my problem: I would like to create a boolean matrix B that is True wherever matrix A holds a value contained in vector v. One inconvenient solution would be:

>>> import numpy as np
>>> A = np.array([[0,1,2], [1,2,3], [2,3,4]])
>>> A
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
>>> v = [1,2]
>>> B = (A==v[0]) + (A==v[1]) # matlab: ``B = ismember(A,v)``
>>> B
array([[False,  True,  True],
       [ True,  True, False],
       [ True, False, False]], dtype=bool)

Is there a more convenient solution for the case where A and v have more values?

Cheers!

+1  A: 

Here's a naive one-liner:

[any (value in item for value in v) for item in A]

Sample output:

>>> A = ( [0,1,2], [1,2,3], [2,3,4] )
>>> v = [1,2]
>>> [any (value in item for value in v) for item in A]
[True, True, True]
>>> v = [1]
>>> [any (value in item for value in v) for item in A]
[True, True, False]

It's a very Pythonic approach, but I'm certain it won't scale well on large arrays or vectors because Python's in operator is a linear search (on lists/tuples, at least).

As Brooks Moses pointed out in the comment below, the output should be a 3x3 matrix. That's why you give sample output in your questions. (Thanks, Brooks.)

>>> v=[1,2]
>>> [ [item in v for item in row] for row in A]
[[False, True, True], [True, True, False], [True, False, False]]
>>> v=[1]
>>> [ [item in v for item in row] for row in A]
[[False, True, False], [True, False, False], [False, False, False]]
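
The linear-search concern mentioned above can be sidestepped by converting v to a set once, so that each membership test is a hash lookup on average rather than a scan of v. A minimal sketch, assuming plain nested sequences as in the sample; the name `v_set` is introduced here for illustration and is not from the thread:

```python
A = ([0, 1, 2], [1, 2, 3], [2, 3, 4])
v = [1, 2]

# Build the set once; each `item in v_set` test is then O(1) on average
# instead of walking the whole list v for every element of A.
v_set = set(v)

B = [[item in v_set for item in row] for row in A]
print(B)
```

For a two-element v the difference is negligible; it should only start to matter as v grows.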
Mark Rushakoff
You've got a good answer to the wrong question, I think -- you want A to be a 3x3 array, and return a 3x3 truth value for each of those 9 elements. Thus, adjusting your answer slightly: [[(item in v) for item in row] for row in A] works fine. I'm also curious why you expect this would be slow.
Brooks Moses
+4  A: 

I don't know much numpy, but here's a pure Python one:

>>> A = [[0,1,2], [1,2,3], [2,3,4]]
>>> v = [1,2]
>>> B = [map(lambda val: val in v, a) for a in A]
>>>
>>> B
[[False, True, True], [True, True, False], [True, False, False]]

Edit: As Brooks Moses notes, and as some simple timing seems to show, this one is probably better:

>>> B = [ [val in v for val in a] for a in A]
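
The "simple timing" mentioned above could be reproduced roughly as follows. This is only a sketch: the input size and repeat count are arbitrary assumptions, the function names are made up for illustration, and the numbers will vary by machine and Python version. Note that on Python 3 `map` returns an iterator, so it is wrapped in `list()` here.

```python
import timeit

# Hypothetical "large A, small v" input; the sizes are arbitrary.
A = [[0, 1, 2], [1, 2, 3], [2, 3, 4]] * 1000
v = [1, 2]

def with_map():
    # list() is required on Python 3, where map() is lazy.
    return [list(map(lambda val: val in v, a)) for a in A]

def with_comprehension():
    return [[val in v for val in a] for a in A]

# Both variants must produce the same result before timing them.
assert with_map() == with_comprehension()

t_map = timeit.timeit(with_map, number=20)
t_comp = timeit.timeit(with_comprehension, number=20)
print("map + lambda:  %.4f s" % t_map)
print("comprehension: %.4f s" % t_comp)
```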
balpha
+1: This works with numpy arrays.
ire_and_curses
Naive question: Why the map(lambda...) syntax, rather than just [(val in v) for val in a]? Is there a meaningful difference in this case?
Brooks Moses
@Brooks Moses: You're right, I guess there's not, and the double comprehension even seems to be a little faster (I only did some naive timing, though). Edited.
balpha
Actually, for small v and large A, it's a lot faster (factor 2).
balpha
+3  A: 

Using numpy primitives:

>>> import numpy as np
>>> A = np.array([[0,1,2], [1,2,3], [2,3,4]])
>>> v = [1,2]
>>> print np.vectorize(lambda x: x in v)(A)
[[False  True  True]
 [ True  True False]
 [ True False False]]

For non-tiny inputs convert v to a set first for a large speedup.
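
The set conversion suggested above might look like the following sketch (Python 3 syntax). The names `v_set` and `is_member` are illustrative only. Keep in mind that `np.vectorize` is documented as a convenience wrapper around a Python-level loop, so the gain here comes from the faster lookup, not from true vectorization.

```python
import numpy as np

A = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
v = [1, 2]

# Convert v once; set membership is a hash lookup rather than a list scan.
v_set = set(v)

# otypes=[bool] fixes the output dtype instead of letting NumPy infer it
# from the first call.
is_member = np.vectorize(lambda x: x in v_set, otypes=[bool])

B = is_member(A)
print(B)
```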

To use numpy.setmember1d:

Auniq, Ainv = np.unique1d(A, return_inverse=True)
result = np.take(np.setmember1d(Auniq, np.unique1d(v)), Ainv).reshape(A.shape)
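
`unique1d` and `setmember1d` are no longer available in current NumPy releases. The same unique-then-inverse idea can be sketched with `np.unique` and `np.isin`, assuming a reasonably recent NumPy; `np.isin(..., assume_unique=True)` stands in for `setmember1d` here and is my substitution, not the original author's code.

```python
import numpy as np

A = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
v = [1, 2]

# Sorted unique values of A, plus for every element of the flattened A
# the index of its value within A_uniq.
A_uniq, A_inv = np.unique(A.ravel(), return_inverse=True)

# Membership is tested only on the duplicate-free arrays.
member_uniq = np.isin(A_uniq, np.unique(v), assume_unique=True)

# Map the per-unique-value answer back onto the original layout.
result = member_uniq[A_inv].reshape(A.shape)
print(result)
```

Flattening A before calling `np.unique` keeps the inverse index one-dimensional, which makes the final indexing step straightforward.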
Ants Aasma
This is broken: see the 2nd row, rightmost column. What's that True doing there? It corresponds to 3 in A, which is NOT in v. Alas, setmember1d does NOT correctly support arrays with duplicates.
Alex Martelli
Corrected, setmember1d documentation could be clearer on this.
Ants Aasma
+1  A: 

I think the closest you'll get is numpy.setmember1d, but it won't work well with your example. I think your solution (B = (A==v[0]) + (A==v[1])) may actually be the best one.
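
The element-by-element comparison from the question can be generalized to a v of any length without writing out each term. A sketch of two equivalent ways to do that with standard NumPy operations; the variable names are illustrative:

```python
import numpy as np

A = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
v = np.array([1, 2])

# Broadcasting: A[..., np.newaxis] has shape (3, 3, 1); comparing it with
# v (shape (2,)) yields shape (3, 3, 2), one slice per value in v.
# any(axis=-1) then ORs those slices together.
B = (A[..., np.newaxis] == v).any(axis=-1)

# Same result, written as an explicit OR over one comparison per value.
B_alt = np.logical_or.reduce([A == x for x in v])

print(B)
```

Both forms build temporaries of about A.size * len(v) booleans, so they suit a short v better than a long one.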

ars
+1  A: 

Alas, setmember1d as it exists in numpy is broken when either array has duplicated elements (as A does here). Download this version, save it as e.g. sem.py somewhere on your sys.path, add import numpy as nm as its first line, and THEN this finally works:

>>> import sem
>>> print sem.setmember1d(A.reshape(A.size), v).reshape(A.shape)
[[False True True]
 [True True False]
 [True False False]]

Note the difference from @Ants Aasma's similar answer: this version gets the second row of the resulting bool array right, while his version (using the setmember1d that ships with numpy) incorrectly has the second row as all Trues.
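
For readers on a newer NumPy: later releases added `np.isin` (in version 1.13, as far as I know), which accepts duplicates and keeps the shape of its first argument, so no patched module or reshaping is needed. It postdates this thread, so treat the following as a sketch of a modern alternative rather than part of the original answer:

```python
import numpy as np

A = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
v = [1, 2]

# Element-wise membership test; the result has the same shape as A,
# and duplicates in either argument are fine.
B = np.isin(A, v)
print(B)
```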

Alex Martelli