I've encountered an annoying problem in outputting a floating point number. When I format 11.545 with a precision of 2 decimal places on Windows it outputs "11.55", as I would expect. However, when I do the same on Linux the output is "11.54"!

I originally encountered the problem in Python, but further investigation showed that the difference is in the underlying C runtime library. (The architecture is x86-64 in both cases.) Running the following line of C produces different results on Windows and Linux, just as it does in Python.

printf("%.2f", 11.545);

To shed more light on this I printed the number to 20 decimal places ("%.20f"):

Windows: 11.54500000000000000000
Linux:   11.54499999999999992895

I know that 11.545 cannot be stored exactly as a binary number. So what appears to be happening is that Linux prints the value that is actually stored, as precisely as it can, while Windows prints the simplest decimal representation of it, i.e. it tries to guess what the user most likely meant.

My question is: is there any (reasonable) way to emulate the Linux behaviour on Windows?

(While the Windows behaviour is certainly the intuitive one, in my case I actually need to compare the output of a Windows program with that of a Linux program and the Windows one is the only one I can change. By the way, I tried to look at the Windows source of printf, but the actual function that does the float->string conversion is _cfltcvt_l and its source doesn't appear to be available.)

EDIT: the plot thickens! The theory about this being caused by an imprecise representation might be wrong, because 0.125 does have an exact binary representation and it's still different when output with '%.2f' % 0.125:

Windows: 0.13
Linux:   0.12

However, round(0.125, 2) returns 0.13 on both Windows and Linux.

A: 

Consider comparing floating point numbers with some tolerance/epsilon instead. This is much more robust than trying to match exactly.

What I mean is, instead of saying that two floats are equal when:

f1 == f2

Say they're equal when:

fabs(f1 - f2) < eps

For some small eps. More details on this issue can be found here.
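The same comparison as a minimal Python sketch. The helper name `nearly_equal` and the default `eps` of 1e-9 are assumptions chosen for illustration; a suitable tolerance depends on the magnitude of the values being compared.

```python
import math

def nearly_equal(f1, f2, eps=1e-9):
    """Return True when f1 and f2 differ by less than eps (absolute tolerance)."""
    return math.fabs(f1 - f2) < eps

print(0.1 + 0.2 == 0.3)              # False: exact comparison fails
print(nearly_equal(0.1 + 0.2, 0.3))  # True: the difference is far below eps
```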

Eli Bendersky
Yes, I would normally do that, but in this case I'm comparing strings, not floats directly. The strings are part of a much larger program output. Besides, if they're rounded to 2 places I cannot realistically use an eps of 0.02 - that would conceal real differences.
Evgeny
A: 

You could try subtracting (or adding for a negative number) a small delta that will have no effect on the rounding for numbers far enough away from the precision.

For example, if you're rounding with %.2f, try this version on Windows:

printf("%.2f", 11.545 - 0.001);

Floating point numbers are notoriously problematic if you don't know what's happening under the covers. In that case, your best bet is to write (or use) a decimal type library to alleviate the problems.


The example program:

#include <stdio.h>
int main (void) {
    printf("%.20f\n", 11.545);
    printf("%.2f\n", 11.545);
    printf("%.2f\n", 11.545 + 0.001);
    return 0;
}

outputs this in my Cygwin environment:

11.54499999999999992895
11.54
11.55

which is okay for your specific case (it's going the wrong way but should hopefully apply in the other direction as well: you need to test it) but you should check your entire possible input range if you want to be certain this will work for all your cases.


Update:

Evgeny, based on your comment:

It works for this specific case, but not as a general solution. For instance if the number I want to format is 0.545 instead of 11.545 then '%.2f' % (0.545 - 0.001) returns "0.54", while '%.2f' % 0.545 on Linux correctly returns "0.55".

that's why I said you would have to check the entire range to see if it would work, and why I stated a decimal data type would be preferable.

If you want decimal accuracy, that's what you'll have to do. But you might want to consider the cases in that range where Linux goes the other way too (as per your comment). There may be situations where Linux and Windows disagree in the opposite direction to what you've found, and a decimal type probably won't solve that.

You may need to make your comparison tools a little more intelligent inasmuch as they can ignore a difference of 1 in the final fractional place.
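A rough sketch of what such a tolerant comparison could look like for two already formatted numbers. The function name `same_within_last_digit` is hypothetical, and the sketch assumes both programs print the same number of fractional digits.

```python
from decimal import Decimal

def same_within_last_digit(a, b):
    """Compare two decimal strings, allowing a difference of 1 in the last place."""
    da, db = Decimal(a), Decimal(b)
    exp_a = da.as_tuple().exponent
    # Require the same number of fractional digits in both strings.
    if exp_a != db.as_tuple().exponent:
        return False
    # One unit in the last printed place, e.g. 0.01 for two decimals.
    quantum = Decimal(1).scaleb(exp_a)
    return abs(da - db) <= quantum

print(same_within_last_digit("11.54", "11.55"))  # True
print(same_within_last_digit("11.54", "11.56"))  # False
```

Note that this hides every genuine off-by-one difference in the last digit as well, so it trades some sensitivity for robustness.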

paxdiablo
It works for this specific case, but not as a general solution. For instance if the number I want to format is 0.545 instead of 11.545 then `'%.2f' % (0.545 - 0.001)` returns "0.54", while `'%.2f' % 0.545` on Linux correctly returns "0.55".
Evgeny
+1  A: 

The decimal module gives you access to several rounding modes:

import decimal

fs = ['11.544','11.545','11.546']

def convert(f,nd):
    # we want 'nd' beyond the dec point
    nd = f.find('.') + nd
    c1 = decimal.getcontext().copy()
    c1.rounding = decimal.ROUND_HALF_UP
    c1.prec = nd
    d1 = c1.create_decimal(f)
    c2 = decimal.getcontext().copy()
    c2.rounding = decimal.ROUND_HALF_DOWN
    c2.prec = nd
    d2 = c2.create_decimal(f)
    # Parenthesised single-argument print works in both Python 2 and 3
    print("%s %s" % (d1, d2))

for f in fs:
    convert(f,2)

You can construct a decimal from an int or a string. In your case feed it a string with more digits than you want and truncate by setting context.prec.

Here is a link to a pymotw post w/ a detailed overview of the decimal module:

http://broadcast.oreilly.com/2009/08/pymotw-decimal---fixed-and-flo.html

Cyrus
Yes, it does produce 11.55 on Linux, but I cannot edit the Linux program, only the Windows one. So I need to get the Windows one to produce 11.54 instead.
Evgeny
Sorry for misreading - I updated the solution with info on the decimal module. Hopefully this will give you the rounding control to duplicate the linux behavior on windows.
Cyrus
Good idea, but, same as paxdiablo's answer it doesn't pass the "0.545" test - that rounds down to "0.54" when the Linux program would output "0.55".
Evgeny
Yes, it does print out what you say, but unfortunately what I'm after is far more complex than that. Linux doesn't just always round down. I tried to explain in the original post what I *think* it does.
Evgeny
+2  A: 

First of all, it sounds like Windows has it wrong in this case (not that this really matters). The C Standard requires that the value output by %.2f is rounded to the appropriate number of digits. The best known algorithm for this is dtoa, implemented by David M. Gay. You can probably port this to Windows or find a native implementation.
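If the Windows program is written in Python, porting dtoa may not be necessary. The sketch below is not dtoa; it gets a correctly rounded result by converting the float to its exact decimal value and rounding that. It assumes the Linux side rounds the exact stored value and breaks exact ties to even, that only fixed-point `%.Nf` output is needed, and that Python 2.7+ or 3.x is available. The name `format_exact_half_even` is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

def format_exact_half_even(value, places):
    """Format a float like '%.<places>f', rounding its exact binary value."""
    exact = Decimal(value)                # exact stored value, no rounding
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.01 for places=2
    with localcontext() as ctx:
        ctx.prec = 400                    # assumed headroom for large doubles
        rounded = exact.quantize(quantum, rounding=ROUND_HALF_EVEN)
    return format(rounded, "f")

print(format_exact_half_even(11.545, 2))  # 11.54
print(format_exact_half_even(0.545, 2))   # 0.55
print(format_exact_half_even(0.125, 2))   # 0.12
```

These three values match the Linux outputs quoted in this thread, but the whole input range would still need checking against the real Linux program.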

If you haven't already read "How to Print Floating-Point Numbers Accurately" by Steele and White, find a copy and read it. It is definitely an enlightening read. Make sure to find the original from the late 70's. I think that I purchased mine from ACM or IEEE at some point.

D.Shawley
No, both implementations are perfectly valid. IEEE 754 permits several rounding modes; Linux uses round-to-even, and Windows uses round-away-from-0.
Glenn Maynard
You may be right, Glenn - `round()` doesn't do this, but `printf()` does! `round()` seems to be consistent between Windows and Linux.
Evgeny
The C `round()` function always rounds away from zero so I would expect it to be consistent.
D.Shawley
A: 

You may be able to subtract a tiny amount from the value to force the rounding down:

print("%.2f" % (11.545 - 1e-12))
gnibbler
+2  A: 

I don't think Windows is doing anything especially clever (like trying to reinterpret the float in base 10) here: I'd guess that it's simply computing the first 17 significant digits accurately (which would give '11.545000000000000') and then tacking extra zeros on the end to make up the requested number of places after the point.

As others have said, the different results for 0.125 come from Windows using round-half-up and Linux using round-half-to-even.
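The two tie-breaking rules can be seen side by side with the decimal module. This is only an illustration of the rounding modes themselves, not of either C runtime; the variable names are arbitrary.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

cent = Decimal("0.01")

# 0.125 is an exact tie between 0.12 and 0.13.
tie = Decimal("0.125")
half_up = tie.quantize(cent, rounding=ROUND_HALF_UP)
half_even = tie.quantize(cent, rounding=ROUND_HALF_EVEN)
print(half_up)    # 0.13
print(half_even)  # 0.12

# 0.375 is also a tie, but both modes agree because 0.38 ends in an even digit.
other_tie = Decimal("0.375")
print(other_tie.quantize(cent, rounding=ROUND_HALF_UP))    # 0.38
print(other_tie.quantize(cent, rounding=ROUND_HALF_EVEN))  # 0.38
```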

Note that for Python 3.1 (and Python 2.7, when it appears), the result of formatting a float will be platform independent (except possibly on unusual platforms).

Mark Dickinson
I don't think it's just tacking on extra zeros, because the number is not actually 11.5450000... even to 17 digits. But +1 for the info about Python 3.1 and 2.7 - looking forward to that!
Evgeny
The 'round to 17 significant figures (not 17 places after the point!) and tack on extra zeros' formula seems to work for me, for this particular number. The first 17 digits look like 11.54499..; the 18th is also a 9, so we round up. The guess was motivated by the original 1985 version of IEEE 754: in section 5.6 (Binary <---> Decimal conversions), you'll find the text "the implementor may, at his option, alter all significant digits after the ninth for single and the seventeenth for double to other decimal digits, typically 0." I'm guessing that Microsoft chose to exercise this option. :)
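The guess above can be tried out in a few lines. To be clear, this models a hypothesis about the Windows runtime, not its actual implementation: it rounds to 17 significant digits first, then rounds the resulting decimal half-up. The function name is invented, and the sketch assumes a Python version whose `%.17g` formatting is itself correctly rounded (3.1+ or 2.7).

```python
from decimal import Decimal, ROUND_HALF_UP

def format_17_digits_then_half_up(value, places):
    """Model of the guess: keep 17 significant digits, then round half-up."""
    # Step 1: round to 17 significant digits; the rest is treated as zeros.
    seventeen = Decimal("%.17g" % value)
    # Step 2: round that decimal value to the requested places, ties going up.
    quantum = Decimal(1).scaleb(-places)
    return format(seventeen.quantize(quantum, rounding=ROUND_HALF_UP), "f")

print(format_17_digits_then_half_up(11.545, 2))  # 11.55
print(format_17_digits_then_half_up(0.125, 2))   # 0.13
```

Both results agree with the Windows outputs reported in the question, which is consistent with the guess but does not prove it.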
Mark Dickinson