Hi,

Can anyone explain the following? I'm using Python 2.5.

Consider 1*3*5*7*9*11*...*49. If you type all of that into the IPython(x,y) interactive console, you'll get 58435841445947272053455474390625L, which is correct. (Why odd numbers? That's just the way I originally did it.)

NumPy's multiply.reduce() or prod() should yield the same result for the equivalent range, and it does, up to a certain point. Here it is already wrong:

>>> k = range(1, 50, 2)
>>> multiply.reduce(k)
-108792223

Using prod(k) also gives -108792223. Other incorrect results start to appear for equivalent ranges of length 12 (that is, k = range(1, 24, 2)).
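
For comparison, plain Python over the same list k gets it right (using the built-in reduce, since this is Python 2.5):

>>> import operator
>>> reduce(operator.mul, k)
58435841445947272053455474390625L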

I'm not sure why. Can anyone help?

A: 

Arrays or lists make no difference for prod() or multiply.reduce(); correct and incorrect results happen the same way. Could this be a bug?

Alex
+2  A: 

The CPU doesn't really multiply numbers; it only performs operations it has been taught, on fixed groups of 0-1 bits.

Python's '*' handles large integers perfectly, through an arbitrary-precision representation and special code that goes beyond the CPU's or FPU's multiply instructions.

This is actually unusual as languages go.

In most other languages, a number is usually represented as a fixed array of bits. For example, in C or SQL you could choose an 8-bit integer that can represent 0 to 255, or -128 to +127, or a 16-bit integer that can represent up to 2^16 - 1, which is 65535. When only a limited range of numbers can be represented, going past the limit with an operation like * or + can have an unpredictable effect, like getting a negative number. You may have encountered such a problem when using the external library (NumPy here), which is implemented natively in C, not Python.
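
To make the wraparound concrete, here is a small sketch using NumPy's fixed-width int32 as a stand-in for a C int (newer NumPy versions may also emit an overflow warning):

>>> import numpy
>>> a = numpy.int32(2000000000)
>>> a + a
-294967296

The true sum, 4000000000, doesn't fit in 32 bits; what comes back is 4000000000 - 2^32.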

Paul
While there are some things you say that might be considered useful and helpful to some people, the claim that a "CPU doesn't really multiply numbers" is pretty bogus. In fact, very few CPUs do NOT have a multiply instruction that works on their native integer sizes (generally 32- or 64-bit for desktop machines running Python). I think you might have meant to say that CPUs don't handle *large* integer multiplication directly, but that's not what you said.
Peter Hansen
@Peter I said number. Taking Wikipedia's definition of a number (http://en.wikipedia.org/wiki/Number) instead of our own personal views of what the word might mean, I'll stand behind my reasoning: the CPU's MUL instruction only works on specific ranges of numbers that fit an industry-defined spec, and even then it will not provide the product of every pair of numbers within that spec.
Paul
+4  A: 

This is because numpy.multiply.reduce() converts the range list to an array of type numpy.int32, and the reduce operation overflows what can be stored in 32 bits at some point:

>>> type(numpy.multiply.reduce(range(1, 50, 2)))
<type 'numpy.int32'>
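
You can reproduce the wrapped value without NumPy by simulating a signed 32-bit multiply by hand (an illustrative sketch, not NumPy's actual code path):

exact = 1
wrapped = 1
first_bad = 0
for n in range(1, 50, 2):
    exact *= n
    # Keep only the low 32 bits of the product, like a C int would...
    wrapped = (wrapped * n) & 0xFFFFFFFF
    # ...and reinterpret the top bit as the sign bit (two's complement).
    if wrapped >= 2**31:
        wrapped -= 2**32
    if wrapped != exact and not first_bad:
        first_bad = n
print first_bad   # 21: the wrapped product is already wrong here
print wrapped     # -108792223, the same value as in the question

Note that the first wrong value (after multiplying by 21, i.e. a range of length 11) is still positive; the result only becomes visibly wrong, i.e. negative, a step later.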

As Mike Graham says, you can pass the dtype parameter to get Python integers instead of the default:

>>> res = numpy.multiply.reduce(range(1, 50, 2), dtype=object)
>>> res
58435841445947272053455474390625L
>>> type(res)
<type 'long'>

But using NumPy to work with Python objects is pointless in this case; the best solution is KennyTM's (note that functools.reduce exists only on Python 2.6+; on the asker's Python 2.5, the built-in reduce does the same thing):

>>> import functools, operator
>>> functools.reduce(operator.mul, range(1, 50, 2))
58435841445947272053455474390625L
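
As a sanity check, the same value follows from the identity 49! = (2^24 * 24!) * (1 * 3 * ... * 49), since the 24 even factors of 49! multiply out to 2^24 * 24!. On Python 2.6+, where math.factorial is available:

>>> import math
>>> math.factorial(49) // (2**24 * math.factorial(24))
58435841445947272053455474390625L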
Luper Rouch
To use a different data type, you can also pass a dtype argument to some NumPy functions, e.g. np.prod([1,2], dtype=np.int64); however, even int64 is not big enough to compute the problem presented here: the exact product is about 5.8 * 10^31, while int64 tops out at 2^63 - 1, about 9.2 * 10^18. I don't think NumPy has unlimited-precision integers, unfortunately.
Johannes Sasongko
Yes, NumPy is not the right tool for working with arbitrary-precision integers; the correct way is to use plain Python, as KennyTM said.
Luper Rouch
You can pass `dtype=object` to work with Python objects in numpy if you need arrays of bignums (which one usually does not!).
Mike Graham