I work daily with Python 2.4 at my company. I used the versatile logarithm function 'log' from the standard math library, and when I entered log(2**31, 2) it returned 31.000000000000004, which struck me as a bit odd.

I did the same thing with other powers of 2, and it worked perfectly. I also ran 'log10(2**31) / log10(2)' and got a round 31.0.

I tried running the same original function in Python 3.0.1, assuming that it was fixed in a more advanced version.

Why does this happen? Is it possible that there are some inaccuracies in mathematical functions in Python?

+38  A: 

This is to be expected with computer arithmetic. It is following particular rules, such as IEEE 754, that probably don't match the math you learned in school.

If this actually matters, use Python's decimal type.

Example:

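from decimal import Decimal, Context

ctx = Context(prec=20)  # work with 20 significant decimal digits
two = Decimal(2)
# log2(2**31) as ln(2**31) / ln(2); Decimal.ln() needs Python 2.6 or later
ctx.divide(ctx.power(two, Decimal(31)).ln(ctx), two.ln(ctx))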
Matthew Flaschen
+1 for a nice answer, but mostly for "If this actually matters." Probes were flown to Saturn with less precision.
dwc
Indeed. The italics is the most important part of the answer.
Matthew Flaschen
+16  A: 

Always assume that floating-point operations will have some error in them, and check for equality taking that error into account (either a percentage value like 0.00001% or a fixed value like 0.00000000001). This inaccuracy is a given, since not all decimal numbers can be represented in binary with a fixed number of bits of precision.
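A minimal sketch of the tolerance check described above, written for Python 3 (the question itself used Python 2.4). The helper name `nearly_equal` and its default tolerances are illustrative choices, not something from this answer; Python 3.5+ also ships `math.isclose`, which performs a similar check.

```python
import math

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Hypothetical helper: True if a and b agree within a relative
    tolerance (scaled by the larger magnitude) or a fixed absolute one."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

result = math.log(2**31, 2)

# Exact comparison is fragile: it fails whenever rounding nudges the result.
print(result == 31.0)

# Tolerance-based comparison accepts the tiny rounding error.
print(nearly_equal(result, 31.0))
print(math.isclose(result, 31.0))  # similar standard-library check (Python 3.5+)
```

The relative tolerance covers the "percentage" case and the absolute tolerance covers the "fixed value" case; the absolute one matters mostly when comparing against values near zero.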

Your particular case is not one of them if Python uses IEEE 754, since 31 should be easily representable with even single precision. It's possible, however, that it loses precision in one of the many steps it takes to calculate log2(2**31), simply because it doesn't have code to detect special cases like a direct power of two.
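One plausible way the error creeps in, sketched in Python 3: if a two-argument logarithm is evaluated as a quotient of two natural logarithms, each of the three steps (two logarithms and one division) rounds independently. That `math.log(x, 2)` works this way internally is an assumption here, not something this thread confirms; `math.log2` (Python 3.3+) is a dedicated base-2 routine shown for comparison.

```python
import math

x = 2**31

numerator = math.log(x)             # ln(2**31), rounded to a double
denominator = math.log(2)           # ln(2), also rounded
quotient = numerator / denominator  # the division rounds once more

print(repr(quotient))        # may or may not be exactly 31.0
print(repr(math.log(x, 2)))  # the two-argument form from the question
print(repr(math.log2(x)))    # dedicated base-2 logarithm
```

Neither ln(2**31) nor ln(2) is exactly representable, so even when both are rounded as well as possible, their quotient can land one step away from 31.0.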

paxdiablo
+1 I think you have a handle on it with your last sentence.
Matthew Flaschen
Very interesting. I have been writing code for a very long time, and this is the first time I have come across this phenomenon. But after the responses here, I think I am beginning to see the bigger picture of why it happens.
Avihu Turzion
+5  A: 

Floating-point operations are not exact in general. They return a result with an acceptable relative error for the language/hardware infrastructure.

In general, it's quite wrong to assume that floating-point operations are precise, especially with single precision. See the "Accuracy problems" section of Wikipedia's Floating point article :)

NicDumZ
+2  A: 

This is normal. I would expect log10 to be more accurate than log(x, y), since it knows exactly what the base of the logarithm is. There may also be some hardware support for calculating base-10 logarithms.

quant_dev
+17  A: 

You should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

http://docs.sun.com/source/806-3568/ncg_goldberg.html

bayer
Thank you very much. That's a mighty reference.
Avihu Turzion
+1 for an excellent reference
Brian Agnew
+3  A: 

IEEE double-precision floating-point numbers store 52 mantissa bits (53 bits of precision counting the implicit leading bit). Since 10^15 < 2^52 < 10^16, a double has between 15 and 16 significant figures. The result 31.000000000000004 is correct to 16 figures, so it is as good as you can expect.
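These figures can be checked from Python 3 itself. The sketch below assumes the platform uses IEEE 754 doubles (true of essentially every mainstream Python build). It reads the precision parameters from `sys.float_info` and shows that 31.000000000000004 is the very next representable double after 31.0, so the result in the question is off by a single unit in the last place.

```python
import sys

print(sys.float_info.mant_dig)  # bits of precision, counting the implicit leading bit
print(sys.float_info.dig)       # decimal digits that always survive a round trip

# Doubles in [16, 32) are spaced 2**-48 apart: 2**4 times the machine epsilon 2**-52.
spacing = sys.float_info.epsilon * 16
next_after_31 = 31.0 + spacing

print(repr(next_after_31))      # 31.000000000000004
print((31.0).hex(), next_after_31.hex())
```

The hexadecimal forms differ only in the last mantissa digit, which makes the one-step gap visible.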

John D. Cook
+1  A: 

The representation (float.__repr__) of a number in Python tries to return a string of digits that is as close to the real value as possible when converted back, given that IEEE 754 arithmetic is precise only up to a limit. In any case, if you printed the result, you wouldn't notice:

>>> from math import log
>>> log(2**31,2)
31.000000000000004
>>> print log(2**31,2)
31.0

print converts its arguments to strings (in this case, through the float.__str__ method), which caters for the inaccuracy by displaying fewer digits:

>>> log(1000000,2)
19.931568569324174
>>> print log(1000000,2)
19.9315685693
>>> 1.0/10
0.10000000000000001
>>> print 1.0/10
0.1
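In current Python 3, `str` and `repr` of a float both give the shortest string that round-trips, so the session above looks different there. The older behaviour can be approximated with explicit format specifiers, on the understanding that the Python 2 releases of that era used 17 significant digits for `repr` and 12 for `str`. The sketch below is Python 3 and only imitates that formatting.

```python
x = 1.0 / 10

print(format(x, ".17g"))  # 17 significant digits expose the stored value: 0.10000000000000001
print(format(x, ".12g"))  # 12 significant digits hide the error: 0.1

y = 31.000000000000004    # the value from the question
print(format(y, ".17g"))  # 31.000000000000004
print(format(y, ".12g"))  # 31 (Python 2's str also appended ".0")
```

Rounding to 12 digits for display changes only what is shown; the stored value is the same in both cases.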

usuallyuseless' answer is very useful, actually :)

ΤΖΩΤΖΙΟΥ