Is there any advantage to using numpy when you're doing a large number of operations on lists of binary values? How about integers within a small range (like just the numbers 1, 2, and 3)?
+1
A:
If the number of input values is huge, or if you are doing a lot of operations, you might want to try bitarray. Or, see the bool/int8/uint8 dtypes in NumPy's ndarray:
In [1]: import numpy as np
In [2]: data = np.array([0,1,1,0], dtype=bool)
In [3]: data
Out[3]: array([False, True, True, False], dtype=bool)
In [4]: data.size
Out[4]: 4
In [5]: data.nbytes
Out[5]: 4
Alok
2010-02-20 04:24:44
I have found bitarray to be kind of slow sometimes.
Justin Peel
2010-02-20 05:16:40
It could be - I have only heard and read about it, but from the description, it seems like it should be fast. Is it slower than Python lists?
Alok
2010-02-20 08:02:49
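To address the small-range-integer part of the question, here is a minimal sketch (variable names are illustrative) of the memory saving from choosing a compact dtype such as uint8, which still supports the same vectorized operations:

```python
import numpy as np

# Values limited to 1, 2, 3 fit comfortably in uint8: one byte per
# element instead of the platform default (typically 8 bytes, int64).
small = np.array([1, 2, 3, 1, 2, 3], dtype=np.uint8)
default = np.array([1, 2, 3, 1, 2, 3])  # platform default integer dtype

print(small.nbytes)    # 6 bytes total
print(default.nbytes)  # typically 48 on a 64-bit platform

# Vectorized whole-array operations work the same on the compact dtype.
print(int((small == 2).sum()))  # 2
```

The smaller dtype also tends to help speed indirectly, since less data moves through the cache.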
+3
A:
Eliminating the loops is the source of the performance gain (~10x):
import profile
import numpy as NP

def np_test(a2darray):
    # Sum each row, then sum the row sums, entirely in NumPy.
    row_sums = NP.sum(a2darray, axis=1)
    return NP.sum(row_sums)

def stdlib_test2(a2dlist):
    # Pure-Python equivalent: nested sums over a list of lists.
    return sum([sum(row) for row in a2dlist])

# randint and reshape need integer arguments (1e7 is a float,
# which newer NumPy versions reject).
A = NP.random.randint(1, 6, 10**7).reshape(10**4, 10**3)
B = NP.ndarray.tolist(A)

profile.run("np_test(A)")
profile.run("stdlib_test2(B)")
numpy:
- 10 function calls in 0.025 CPU seconds
lists:
- 10005 function calls in 0.280 CPU seconds
doug
2010-02-20 12:40:18
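As a rough cross-check of the benchmark above, a minimal timeit sketch (smaller array than the original so it runs quickly; names chosen here for illustration) comparing the same two approaches:

```python
import timeit
import numpy as np

# Same setup as the profiled example, scaled down to 10**6 elements.
A = np.random.randint(1, 6, 10**6).reshape(10**3, 10**3)
B = A.tolist()

t_np = timeit.timeit(lambda: A.sum(), number=10)
t_list = timeit.timeit(lambda: sum([sum(row) for row in B]), number=10)

# Both methods compute the same total; only the time differs.
print(f"numpy: {t_np:.4f}s  lists: {t_list:.4f}s")
```

Exact ratios vary by machine and NumPy version, but the vectorized sum is normally far ahead at this size.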