import numpy as np
a = np.arange(1000000).reshape(1000,1000)
print(a**2)

With this code I get this answer. Why do I get negative values?

[[         0          1          4 ...,     994009     996004     998001]
 [   1000000    1002001    1004004 ...,    3988009    3992004    3996001]
 [   4000000    4004001    4008004 ...,    8982009    8988004    8994001]
 ..., 
 [1871554624 1873548625 1875542628 ..., -434400663 -432404668 -430408671]
 [-428412672 -426416671 -424420668 ..., 1562593337 1564591332 1566589329]
 [1568587328 1570585329 1572583332 ..., -733379959 -731379964 -729379967]]
+8  A: 

On your platform, np.arange returns an array of dtype 'int32' (numpy's default integer dtype is platform-dependent):

In [1]: np.arange(1000000).dtype
Out[1]: dtype('int32')

Each element of the array is a 32-bit integer, which can hold values only up to 2**31 - 1 = 2147483647. Squaring the larger elements gives results that do not fit in 32 bits. Each result is truncated to its low 32 bits (it wraps around modulo 2**32) and is still interpreted as a signed 32-bit integer, which is why you see negative numbers.
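As a rough illustration of the wraparound, the sketch below squares a few int32 values and compares them with the exact int64 result reduced modulo 2**32. The sample values and the names `wrapped`, `exact` and `manual` are chosen just for this example; 999999 is the last element of the array in the question.

```python
import numpy as np

# 46340**2 still fits in int32; 46341**2 and 999999**2 do not.
x = np.array([46340, 46341, 999999], dtype=np.int32)

# int32 array arithmetic: overflow wraps silently.
wrapped = x * x

# Exact squares, computed in 64 bits.
exact = x.astype(np.int64) ** 2

# Reduce the exact result modulo 2**32 and reinterpret it as signed 32-bit.
manual = ((exact + 2**31) % 2**32) - 2**31

print(wrapped)
print(exact)
print(manual)
```

The last entry of `wrapped` comes out as -729379967, the same value as the final element printed in the question.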

Edit: In this case, you can avoid the integer overflow by constructing an array of dtype 'int64' before squaring:

a = np.arange(1000000, dtype='int64').reshape(1000, 1000)

Note that the problem you've discovered is an inherent danger when working with numpy. You have to choose your dtypes with care and know beforehand that your code will not lead to arithmetic overflow. For the sake of speed, numpy does not check for overflow in array arithmetic and will not warn you when it occurs.
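One way to check beforehand is to compare the largest value you expect against the limits reported by np.iinfo. This is only a sketch for the squaring case in the question, assuming non-negative inputs; the names `limit`, `largest` and `squares` are made up for the example.

```python
import numpy as np

a = np.arange(1000000, dtype=np.int32)

# Largest value representable in the array's dtype.
limit = np.iinfo(a.dtype).max

# Convert to a Python int so the check itself cannot overflow.
largest = int(a.max())

# Widen the array only if squaring the largest element would not fit.
if largest ** 2 > limit:
    a = a.astype(np.int64)

squares = a ** 2
print(squares.dtype, squares[-1])
```

Here the check fails for int32, so the array is widened to int64 and the squares stay positive.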

See http://mail.scipy.org/pipermail/numpy-discussion/2009-April/041691.html for a discussion of this on the numpy mailing list.

unutbu
+1  A: 

numpy integer types are fixed width and you are seeing the results of integer overflow.

GregS