ansaurus

Question

How to write portable floating point arithmetic in c++ ?

Answer 1

+2 A:

It shouldn't be an issue, IEEE 754 already defines all details of the layout of floats.

The maximum and minimum values storable should be defined in limits.h

Martin Beckett 2009-06-11 17:29:07

I don't think that IEEE 754 is guaranteed in C++.

Éric Malenfant 2009-06-11 17:41:19

No, but can you imagine any reasonable OS that supported C++ but didn't use it!

Martin Beckett 2009-06-11 18:16:10

While I certainly can imagine one, I have to admit I don't know any modern one :)

Éric Malenfant 2009-06-11 19:02:57

Many embedded and signal processing platforms only offer fixed point arithmetics.

peterchen 2009-06-11 19:18:19

I was just thinking that when I wrote 'reasonable'! It's interesting how many embedded systems support c++, although without exceptions, RTTI and the STL it's not really C++.

Martin Beckett 2009-06-11 20:23:15

Answer 2

A:

I believe "limits.h" will include the C library constants INT_MAX and its brethren. However, it is preferable to use "limits" and the classes it defines:

std::numeric_limits<float>, std::numeric_limits<double>, std::numberic_limits<int>, etc...

Craig W. Wright 2009-06-11 17:36:08

Answer 3

+3 A:

Use fixed point.

However, if you want to approach the realm of possibly making portable floating point operations, you at least need to use controlfp to ensure consistent FPU behavior as well as ensuring that the compiler enforces ANSI conformance with respect to floating point operations. Why ANSI? Because it's a standard.

And even then you aren't guaranteeing that you can generate identical floating point behavior; that also depends on the CPU/FPU you are running on.

MSN 2009-06-11 17:39:57

Answer 4

+2 A:

Portable is one thing, generating consistent results on different platforms is another. Depending on what you are trying to do then writing portable code shouldn't be too difficult, but getting consistent results on ANY platform is practically impossible.

Dolphin 2009-06-11 17:47:58

Answer 5

+4 A:

Non-IEEE 754:

Generally, you cannot. There's always a tradeof between consistency and performance, and C++ hands that to you.

For platforms that don't have floating point operations (like embedded and signal processing processors), you cannot use C++ "native" floating point operations, at least not portably so. While a software layer would be possible, that's certainly not feasible for this type of devices.

For these, you could use 16 bit or 32 bit fixed point arithmetic (but you might even discover that long is supported only rudimentary - and frequently, div is very expensive). However, this will be much slower than built-in fixed-point arithmetic, and becomes painful after the basic four operations.

I haven't come across devices that support floating point in a different format than IEEE 754. From my experience, your best bet is to hope for the standard, because otherwise you usually end up building algorithms and code around the capabilities of the device. When sin(x) suddenly costs 1000 times as much, you better pick an algorithm that doesn't need it.

IEEE 754 - Consistency

The only non-portability I found here is when you expect bit-identical results across platforms. The biggest influence is the optimizer. Again, you can trade accuracy and speed for consistency. Most compilers have a option for that - e.g. "floating point consistency" in Visual C++. But note that this is alway accuracy beyond the guarantees of the standard.

Why results become inconsistent? First, FPU registers often have higher resolution than double's (e.g. 80 bit), so as long as the code generator doesn't store the value back, intermediate values are held with higher accuracy.

Second, the equivalences like a*(b+c) = a*b + a*c are not exact due to the limited precision. Nonetheless the optimizer, if allowed, may make use of them.

Also - what I learned the hard way - printing and parsing functions are not necessarily consistent across platforms, probably due to numeric inaccuracies, too.

float

It is a common misconception that float oeprations are intrinsically faster than double. working on large float arrays is faster usually through less cache misses alone.

Be careful with float accuracy. it can be "good enough" for a long time, but I've often seen it fail faster than expected. Float-based FFT's can be much faster due to SIMD support, but generate notable artifacts quite early for audio processing.

peterchen 2009-06-11 18:19:19

Also a lot of graphics cards work in double, you can have dramatic slowdowns when it has to convert float or worse drops into some OpenGL compatibility mode.

Martin Beckett 2009-06-11 20:24:30

Answer 6

A:

If you're assuming that you will get the same results on another system, read http://stackoverflow.com/questions/968435/ first. You might be surprised to learn that your floating point arithmetic isn't even the same across different runs on the very same machine!

MSalters 2009-06-12 09:25:03

ansaurus

tags:

views:

answers:

How to write portable floating point arithmetic in c++ ?

related questions