ansaurus

Question

comparing two arrays

Answer 1

+1 A:

I'm not an expert of Python, but what is wrong with a simple loop from start to end of first array?

In C# I would do something like:

int match=0;

for (int cnt=0; cnt< A.Count;cnt++)
{
    if ((A[cnt]==B[cnt]==1)) match++;
}

Would that be possible in your language?

BerggreenDK 2009-05-18 10:29:06

"Would that be possible in your language?" Yes. :)

ΤΖΩΤΖΙΟΥ 2009-05-18 16:00:41

Possible, yes, but not idiomatic. Slightly faster for long lists, it would seem. For two random lists of length 10'000, your solution (translated to python) took 0.003s on my system, while 'drakosha' took 0.005 and 'bendin' took 0.020s.

bendin 2009-05-20 08:06:35

Stranger yet, for 1'000'000 elements: berggreendk=0.031s, bendin=0.045, drakosha=1.02s which probably doesn't prove anything other than the fact that micro-benchmarking is hard to do right.

bendin 2009-05-20 08:12:14

So does this conclude that my algoritm construction would be the fastest or did I misinterpret these comments?

BerggreenDK 2009-05-20 21:04:49

@BerggreenDK: I believe your algorithm is indeed faster, and not just for long lists. It's actually kind of surprising and disappointing, since it is so clearly not idiomatic.

John Y 2009-05-24 04:02:52

Answer 2

+17 A:

A little shorter and hopefully more pythonic way:

>>> A=[0,0,0,1,0,1]
>>> B=[0,0,1,1,1,1]

x = sum(1 for a,b in zip(A,B) if (a==b==1))
>>> x
2

Drakosha 2009-05-18 10:29:26

Great answer. As python allows you to chain comparison operators you can have (a == b == 1) for the if.

Dave Webb 2009-05-18 10:36:00

Thanks, updated

Drakosha 2009-05-18 11:20:11

you don't need brackets around comparison

SilentGhost 2009-05-18 12:34:23

I like this answer, just uses native python function however the performance is suffering, take a look at http://stackoverflow.com/questions/877059/comparing-two-arrays/877450#877450 for some performance test against a simple numpy version

Daniel Persson 2009-05-18 12:41:13

Answer 3

+1 A:

Motivated by brief need to be perverse, I offer the following solution:

A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]

print len(set(i for i, n in enumerate(A) if n == 1) &
          set(i for i, n in enumerate(B) if n == 1))

(Drakosha's suggestion is a far more reasonable way to solve this problem. This just demonstrates that one can often look at the same problem in different ways.)

bendin 2009-05-18 10:40:34

Nice idea, however it's O(NlogN) opposed to my O(N) solution. You'll need to supply preversed and fast version ;)

Drakosha 2009-05-18 11:26:19

Answer 4

A:

With SciPy:

>>> from scipy import array
>>> A=array([0,0,0,1,0,1])
>>> B=array([0,0,1,1,1,1])

>>> A==B
array([ True,  True, False,  True, False,  True], dtype=bool)
>>> sum(A==B)
4

>>> A!=B
array([False, False,  True, False,  True, False], dtype=bool)
>>> sum(A!=B)
2

wr 2009-05-18 11:23:16

Answer 5

+2 A:

Slightly shorter variation of Drakosha's:

>>> A = [0,0,0,1,0,1]
>>> B = [0,0,1,1,1,1] 
>>> sum(a*b for a, b in zip(A, B) )
2

John Pirie 2009-05-18 12:15:33

Nice, but you should comment that this applies strictly for elements ∈ {0, 1}. I know that this is the specification, but EIBTI :)

ΤΖΩΤΖΙΟΥ 2009-05-18 16:04:14

Answer 6

A:

Here comes another method which exploits the fact that the array just contains zeros and ones.

The scalar product of two vectors x and y is sum( x(i)*y(i) ) the only situation yielding a non zero result is if x(i)==y(i)==1 thus using numpy for instance

from numpy import *
x = array([0,0,0,1,0,1])
y = array([0,0,1,1,1,1])
print dot(x,y)

simple and nice. This method does n multiplications and adds n-1 times, however there are fast implementations using SSE, GPGPU, vectorisation, (add your fancy word here) for dot products (scalar products)

I timed the numpy-method against this method:

sum(1 for a,b in zip(x,y) if (a==b==1))

and found that for 1000000 loops the numpy-version did it in 2121ms and the zip-method did it in 9502ms thus the numpy-version is a lot faster

I did a better analysis of the efectivness and found that for n element(s) in the array the zip method took t1 ms and the dot product took t2 ms for one itteration

elements      zip       dot
1          0.0030    0.0207
10         0.0063    0.0230
100        0.0393    0.0476
1000       0.3696    0.2932
10000      7.6144    2.7781
100000   115.8824   30.1305

From this data one could draw the conclusion that if the number of elements in the array is expected to (in mean) be more than 350 (or say 1000) one should consider to use the dot-product method instead.

Daniel Persson 2009-05-18 12:17:45

9.5 us vs 2.1 us is not exactly "performance suffering".

SilentGhost 2009-05-18 13:20:59

See the added information about array-length comparisons ...

Daniel Persson 2009-05-23 19:46:14

Answer 7

A:

[A[i]+B[i] for i in range(min([len(A), len(B)]))].count(2)

Basically this just creates a new list which has all the elements of the other two added together. You know there were two 1's if the sum is 2 (assuming only 0's and 1's in the list). Therefore just perform the count operation on 2.

Matt 2009-05-18 14:05:20

ansaurus

tags:

views:

answers:

comparing two arrays

related questions