tags:
views: 520
answers: 3

I'm learning to use Numpy and I wanted to see the difference in speed when summing a list of numbers, so I wrote this code:

import numpy
import time

np_array = numpy.arange(1000000)
start = time.time()
sum_ = np_array.sum()
print time.time() - start, sum_

>>> 0.0 1783293664

python_list = range(1000000)
start = time.time()
sum_ = sum(python_list)
print time.time() - start, sum_

>>> 0.390000104904 499999500000

The python_list sum is correct.

If I run the same code summing only up to 1000, both print the right answer. Is there an upper limit to the length of the Numpy array, or is the problem in the Numpy sum function?

Thanks for your help

+2  A: 

The standard Python sum switched over to doing arithmetic with the long type when the running total got larger than a 32-bit int.

The numpy array did not switch to long, and suffered from integer overflow. The price for speed is a smaller range of allowed values.

>>> 499999500000 % 2**32
1783293664L
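
A minimal sketch that reproduces the wraparound on any machine (in the same Python 2 / numpy setup as the question) is to force a 32-bit accumulator explicitly via sum's dtype argument:

>>> import numpy
>>> numpy.arange(1000000).sum(dtype=numpy.int32)   # force a 32-bit accumulator
1783293664
>>> sum(range(1000000))                            # plain Python promotes to long
499999500000L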
S.Lott
+2  A: 

Numpy is creating an array of 32-bit integers (the platform default on a 32-bit build). When it sums them, it accumulates the result in a 32-bit value, which overflows.

# the reported sum is exactly the true sum reduced modulo 2**32
if 499999500000L % (2**32) == 1783293664L:
    print "Overflowed a 32-bit integer"

You can explicitly choose the data type at array creation time:

>>> a = numpy.arange(1000000, dtype=numpy.uint64)
>>> a.sum()
499999500000
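
To see which integer type you actually got, and hence where it will wrap, you can inspect the array's dtype; a small sketch (the dtype('int32') shown assumes a 32-bit build like the questioner's; most 64-bit builds default to int64 and don't overflow at this size):

>>> import numpy
>>> a = numpy.arange(1000000)
>>> a.dtype                      # platform default integer type
dtype('int32')
>>> numpy.iinfo(a.dtype).max     # largest value that type can hold
2147483647

The true total, 499999500000, is well past that ceiling, which is why the accumulator wraps.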
Joe Koberg
+3  A: 

Notice that 499999500000 % 2**32 equals exactly 1783293664 ... i.e., numpy is doing operations modulo 2**32, because that's the type of the numpy.array you've told it to use.

Make np_array = numpy.arange(1000000, dtype=numpy.uint64), for example, and your sum will come out OK (although of course there are still limits, with any finite-size number type).
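
For completeness, here is the question's own timing pattern with that dtype applied (a sketch in the same Python 2 style as the question; exact timings will of course vary by machine):

import numpy
import time

np_array = numpy.arange(1000000, dtype=numpy.uint64)   # wide enough to hold the total
start = time.time()
sum_ = np_array.sum()
print time.time() - start, sum_    # still fast, and now prints 499999500000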

You can use dtype=numpy.object to tell numpy that the array holds generic Python objects; of course, performance will decay as generality increases.
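
A quick sketch of that trade-off (the plain builtin object is used for the dtype here, which is equivalent and also works on current numpy releases, where the numpy.object alias has been removed):

>>> import numpy
>>> a = numpy.arange(1000000, dtype=object)   # elements are ordinary Python ints/longs
>>> a.sum() == 499999500000                   # exact, no wraparound, but noticeably slower
True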

Alex Martelli