I am looking for the fastest way to find the exact element-wise overlap between two arrays in numpy. Given two arrays x and y:

x = numpy.array([1,0,3,0,5,0,7,4],dtype=int)
y = numpy.array([1,4,0,0,5,0,6,4],dtype=int)

What I want is an array of the same length that keeps a value only at the positions where both vectors are equal, and is 0 everywhere else:

array([1,0,0,0,5,0,0,4])

First I tried

x&y
array([1,0,0,0,5,0,6,4])

Then I realised that this is always true for two numbers if they are > 0.

+4  A: 
result = numpy.where(x == y, x, 0)

Have a look at the numpy.where documentation for an explanation. Basically, numpy.where(a, b, c) takes a condition a and returns an array with the same shape as a, taking each value from b where the corresponding element of a is true and from c otherwise. b and c can be scalars.
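
As a minimal sketch, here is the call above applied to the arrays from the question (assuming numpy is imported under its full name, as in this answer):

```python
import numpy

x = numpy.array([1, 0, 3, 0, 5, 0, 7, 4], dtype=int)
y = numpy.array([1, 4, 0, 0, 5, 0, 6, 4], dtype=int)

# Keep x where the two arrays agree, otherwise fill in the scalar 0.
result = numpy.where(x == y, x, 0)
print(result.tolist())  # [1, 0, 0, 0, 5, 0, 0, 4]
```

Note that position 6 (7 versus 6) comes out as 0 here, whereas x & y gave 6 at that position.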

By the way, x & y is not necessarily "always true" for two positive numbers. It computes the bitwise AND of corresponding elements in x and y:

x = numpy.array([2**p for p in range(10)])
# x is [  1   2   4   8  16  32  64 128 256 512]
y = x - 1
# y is [  0   1   3   7  15  31  63 127 255 511]
x & y
# result: [0 0 0 0 0 0 0 0 0 0]

This is because the bitwise representation of each element in x is of the form 1 followed by n zeros, and the corresponding element in y is n 1s. In general, for two non-zero numbers a and b, a & b may equal zero, or be non-zero but not necessarily equal to either a or b.
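
A small sketch of those cases, using made-up values chosen only for illustration (the last pair is the one that produced the unexpected 6 in the question):

```python
import numpy

a = numpy.array([4, 6, 5, 7])
b = numpy.array([3, 3, 5, 6])

# 4 & 3 -> 0b100 & 0b011 = 0  (no shared bits, although both are non-zero)
# 6 & 3 -> 0b110 & 0b011 = 2  (non-zero, but equal to neither operand)
# 5 & 5 -> 5                  (equal inputs give the input back)
# 7 & 6 -> 0b111 & 0b110 = 6  (equals one operand although the inputs differ)
bitwise = a & b
print(bitwise.tolist())  # [0, 2, 5, 6]
```

So a non-zero result from & says nothing about whether the two elements were equal.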

Alok
Perfect, that is what I was looking for, thanks.
Adrian
+1  A: 

Using numpy.where is the most general solution, but in this particular case, and because it is a useful programming practice, you could use x == y as a mask:

mask = x==y  
# mask is  array([ True, False, False,  True,  True,  True, False,  True], dtype=bool)

xf = mask * x
# xf is array([1, 0, 0, 0, 5, 0, 0, 4])

or directly

xf = (x==y) * x
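
One caveat, sketched here with hypothetical float data that is not part of the question: for the integer arrays above the mask product and numpy.where give the same result, but multiplying by a mask still evaluates 0 * inf and 0 * nan, which are nan, so the two approaches can differ on floating-point input.

```python
import numpy

# Made-up float arrays containing inf and nan, purely for illustration.
x = numpy.array([1.0, numpy.inf, numpy.nan, 4.0])
y = numpy.array([1.0, 2.0, 3.0, 4.0])

mask = x == y  # True, False, False, True

# The product computes 0 * inf and 0 * nan, and both are nan.
with numpy.errstate(invalid="ignore"):  # silence the expected warning
    by_product = mask * x  # values: 1, nan, nan, 4

# numpy.where selects elements instead of multiplying them.
by_where = numpy.where(mask, x, 0)  # values: 1, 0, 0, 4
```

If the data may contain inf or nan, numpy.where is probably the safer choice.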

Now imagine some data X (e.g. 1D for sound, 2D for an image, 3D for a movie, etc.):

(X<1) * -1. + (X>1) * 1.

returns data with the value -1 where the amplitude is below 1, 1 where it is above 1, and 0 where it is exactly 1.
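
A minimal sketch of that expression on a hypothetical 1D signal (the sample values are invented for illustration). Elements exactly equal to 1 satisfy neither comparison and therefore come out as 0; for real-valued data, numpy.sign(X - 1) should give the same result.

```python
import numpy

# Hypothetical 1D signal; the values are made up for illustration.
X = numpy.array([0.2, 1.0, 3.5, -2.0, 1.5])

# Each boolean mask is scaled and the two disjoint pieces are summed.
signed = (X < 1) * -1. + (X > 1) * 1.
print(signed.tolist())  # [-1.0, 0.0, 1.0, -1.0, 1.0]

# Alternative formulation: the sign of the data shifted by the threshold.
same = numpy.sign(X - 1)
print(numpy.array_equal(signed, same))  # True
```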

meduz