tags:

views:

848

answers:

7

I have two lists:

A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]

I want to find the number of 1s in the same position in both lists.

The answer for these arrays would be 2.

+1  A: 

I'm not an expert of Python, but what is wrong with a simple loop from start to end of first array?

In C# I would do something like:

int match=0;

for (int cnt=0; cnt< A.Count;cnt++)
{
    if ((A[cnt]==B[cnt]==1)) match++;
}

Would that be possible in your language?

BerggreenDK
"Would that be possible in your language?" Yes. :)
ΤΖΩΤΖΙΟΥ
Possible, yes, but not idiomatic. Slightly faster for long lists, it would seem. For two random lists of length 10'000, your solution (translated to python) took 0.003s on my system, while 'drakosha' took 0.005 and 'bendin' took 0.020s.
bendin
Stranger yet, for 1'000'000 elements: berggreendk=0.031s, bendin=0.045, drakosha=1.02s which probably doesn't prove anything other than the fact that micro-benchmarking is hard to do right.
bendin
So does this conclude that my algoritm construction would be the fastest or did I misinterpret these comments?
BerggreenDK
@BerggreenDK: I believe your algorithm is indeed faster, and not just for long lists. It's actually kind of surprising and disappointing, since it is so clearly not idiomatic.
John Y
+17  A: 

A little shorter and hopefully more pythonic way:

>>> A=[0,0,0,1,0,1]
>>> B=[0,0,1,1,1,1]

x = sum(1 for a,b in zip(A,B) if (a==b==1))
>>> x
2
Drakosha
Great answer. As python allows you to chain comparison operators you can have (a == b == 1) for the if.
Dave Webb
Thanks, updated
Drakosha
you don't need brackets around comparison
SilentGhost
I like this answer, just uses native python function however the performance is suffering, take a look at http://stackoverflow.com/questions/877059/comparing-two-arrays/877450#877450 for some performance test against a simple numpy version
Daniel Persson
+1  A: 

Motivated by brief need to be perverse, I offer the following solution:

A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]

print len(set(i for i, n in enumerate(A) if n == 1) &
          set(i for i, n in enumerate(B) if n == 1))

(Drakosha's suggestion is a far more reasonable way to solve this problem. This just demonstrates that one can often look at the same problem in different ways.)

bendin
Nice idea, however it's O(NlogN) opposed to my O(N) solution. You'll need to supply preversed and fast version ;)
Drakosha
A: 

With SciPy:

>>> from scipy import array
>>> A=array([0,0,0,1,0,1])
>>> B=array([0,0,1,1,1,1])

>>> A==B
array([ True,  True, False,  True, False,  True], dtype=bool)
>>> sum(A==B)
4

>>> A!=B
array([False, False,  True, False,  True, False], dtype=bool)
>>> sum(A!=B)
2
wr
+2  A: 

Slightly shorter variation of Drakosha's:

>>> A = [0,0,0,1,0,1]
>>> B = [0,0,1,1,1,1] 
>>> sum(a*b for a, b in zip(A, B) )
2
John Pirie
Nice, but you should comment that this applies strictly for elements ∈ {0, 1}. I know that this is the specification, but EIBTI :)
ΤΖΩΤΖΙΟΥ
A: 

Here comes another method which exploits the fact that the array just contains zeros and ones.

The scalar product of two vectors x and y is sum( x(i)*y(i) ) the only situation yielding a non zero result is if x(i)==y(i)==1 thus using numpy for instance

from numpy import *
x = array([0,0,0,1,0,1])
y = array([0,0,1,1,1,1])
print dot(x,y)

simple and nice. This method does n multiplications and adds n-1 times, however there are fast implementations using SSE, GPGPU, vectorisation, (add your fancy word here) for dot products (scalar products)

I timed the numpy-method against this method:

sum(1 for a,b in zip(x,y) if (a==b==1))

and found that for 1000000 loops the numpy-version did it in 2121ms and the zip-method did it in 9502ms thus the numpy-version is a lot faster

I did a better analysis of the efectivness and found that for n element(s) in the array the zip method took t1 ms and the dot product took t2 ms for one itteration

elements      zip       dot
1          0.0030    0.0207
10         0.0063    0.0230
100        0.0393    0.0476
1000       0.3696    0.2932
10000      7.6144    2.7781
100000   115.8824   30.1305

From this data one could draw the conclusion that if the number of elements in the array is expected to (in mean) be more than 350 (or say 1000) one should consider to use the dot-product method instead.

Daniel Persson
9.5 us vs 2.1 us is not exactly "performance suffering".
SilentGhost
See the added information about array-length comparisons ...
Daniel Persson
A: 
[A[i]+B[i] for i in range(min([len(A), len(B)]))].count(2)

Basically this just creates a new list which has all the elements of the other two added together. You know there were two 1's if the sum is 2 (assuming only 0's and 1's in the list). Therefore just perform the count operation on 2.

Matt