+1  Q: 

double_t in C99

Hi, I just read that C99 has double_t, which should be at least as wide as double. Does this imply that it gives more digits of precision than the usual 15 or so for double?

Secondly, how to use it: Is only including

#include <float.h> 

enough? I read that one has to set the FLT_EVAL_METHOD to 2 for long double. How to do this? As I work with numerical methods, I would like maximum precision without using an arbitrary precision library.

Thanks a lot...

+1  A: 

Note that you don't get to set FLT_EVAL_METHOD. It's defined by the implementation in <float.h> to tell you the range and precision in which floating-point expressions are evaluated.

If your code is very sensitive to exactly how floating-point operations are performed, you can use the value of that macro to conditionally compile code that handles the differences that matter to you.

So for example, in general you know that double_t will be at least a double in all cases. If you want your code to do something different if double_t is a long double then your code can test if FLT_EVAL_METHOD == 2 and act accordingly.

Note that if FLT_EVAL_METHOD is something other than 0, 1, or 2 you'll need to look to the compiler's documentation to know exactly what type double_t is.
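
As a rough sketch of the kind of test described above (not a definitive recipe), the following checks FLT_EVAL_METHOD at compile time. Note that FLT_EVAL_METHOD comes from <float.h>, while double_t itself is declared in <math.h>. The helper names double_t_kind and double_t_is_wider are hypothetical and exist only for illustration.

```c
#include <float.h>  /* FLT_EVAL_METHOD */
#include <math.h>   /* double_t (the typedef lives here, not in <float.h>) */

/* Hypothetical helper: name the type that double_t corresponds to,
   following the mapping C99 gives for FLT_EVAL_METHOD values 0, 1 and 2. */
static const char *double_t_kind(void)
{
#if FLT_EVAL_METHOD == 0 || FLT_EVAL_METHOD == 1
    return "double";
#elif FLT_EVAL_METHOD == 2
    return "long double";
#else
    /* Any other value: consult the compiler's documentation. */
    return "implementation-defined";
#endif
}

/* Hypothetical helper: nonzero if double_t occupies more storage than double.
   This compares storage size only, which is a coarse proxy for precision. */
static int double_t_is_wider(void)
{
    return sizeof(double_t) > sizeof(double);
}
```

A caller could print double_t_kind() once at startup, or branch on double_t_is_wider() to choose tolerances; code that truly depends on the difference would more likely put the #if around the affected computation itself.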

Michael Burr
Thanks. So I don't get to set this precision. But still, can I just define a variable of type double_t by just including float.h ?
yCalleecharan
Regarding your second point, could you please give me a short example so that I can understand how to implement it?
yCalleecharan
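@yCalleecharan: yes, you can declare variables of type `double_t`, though the type itself is declared in `<math.h>` (`<float.h>` is where `FLT_EVAL_METHOD` lives). What you get is "the implementation’s most efficient types at least as wide as ... double". In other words, something possibly with more precision than `double`, but maybe just a regular old `double`.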
Michael Burr
So it can be less wide than long double?
yCalleecharan
+1  A: 

No. double_t is at least as wide as double; i.e., it might be the same as double. Footnote 190 in the C99 standard makes the intent clear:

The types float_t and double_t are intended to be the implementation’s most efficient types at least as wide as float and double, respectively.

As Michael Burr noted, you can't set FLT_EVAL_METHOD.

If you want the widest floating-point type on any system available using only C99, use long double. Just be aware that on some platforms it will be the same as double (and could even be the same as float).
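
One way to check whether long double actually buys anything on a given platform is to compare the limits in <float.h>. This is only a minimal sketch; the helper names extra_long_double_digits and long_double_is_wider are hypothetical.

```c
#include <float.h>  /* DBL_DIG, LDBL_DIG, DBL_MANT_DIG, LDBL_MANT_DIG */

/* Hypothetical helper: how many more decimal digits long double can
   round-trip compared with double (0 when the two types coincide). */
static int extra_long_double_digits(void)
{
    return LDBL_DIG - DBL_DIG;
}

/* Hypothetical helper: nonzero if long double has a larger significand
   than double on this implementation. */
static int long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

If long_double_is_wider() returns 0, switching from double to long double changes nothing except possibly speed.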

Also, if you "work with numerical methods", you should be aware that for many (most even) numerical methods, the approximation error of the method is vastly larger than the rounding error of double precision, so there's often no benefit to using wider types. Exceptions exist, of course. What type of numerical methods are you working on, specifically?
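
To illustrate the point about approximation error versus rounding error, here is a small sketch under stated assumptions. It uses forward Euler on the test problem y' = -y, y(0) = 1, integrated to t = 1. The method, the test problem, and the helper names are all chosen only for illustration and are not taken from the question. It runs the same integration in double and in long double, then compares the gap between those two results with the error of the method itself.

```c
/* Reference value of exp(-1): the exact solution of y' = -y, y(0) = 1 at t = 1. */
#define EXACT_AT_T1 0.36787944117144232160L

static long double abs_ld(long double x)
{
    return x < 0 ? -x : x;
}

/* Forward Euler for y' = -y, y(0) = 1, taking n steps of size h, in double. */
static double euler_double(double h, int n)
{
    double y = 1.0;
    for (int i = 0; i < n; ++i)
        y += h * (-y);
    return y;
}

/* The same integration carried out in long double. */
static long double euler_long_double(long double h, int n)
{
    long double y = 1.0L;
    for (int i = 0; i < n; ++i)
        y += h * (-y);
    return y;
}

/* Error of the method itself. Only meaningful when n * h == 1. */
static long double method_error(double h, int n)
{
    return abs_ld(euler_long_double(h, n) - EXACT_AT_T1);
}

/* How much the choice of floating-point type changes the answer. */
static long double rounding_difference(double h, int n)
{
    return abs_ld(euler_long_double(h, n) - (long double)euler_double(h, n));
}
```

Calling method_error(0.004, 250) and rounding_difference(0.004, 250) and printing both gives a feel for the scales involved: for a low-order method at this step size, the first should dwarf the second. Higher-order methods narrow the gap, so the comparison is worth repeating for the scheme actually in use.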

Edit: seriously, either (a) just use long double and call it a day or (b) take a few weeks to learn about how floating-point is actually implemented on the platforms that you're targeting, and what the actual accuracy requirements are for the algorithms that you're implementing.

Stephen Canon
Thanks. Is long double wider than double_t? This I'm confused about.
yCalleecharan
@yCalleecharan: On some platforms, yes, on some platforms, no. For instance, on OSX/Intel, `double` and `double_t` are IEEE-754 double, whereas `long double` is the x87 double extended type, which has more precision. On a platform where floating point arithmetic is codegen'd to the x87 unit by default (Windows, maybe?), `double_t` might be the same as `long double`. It is unlikely that `double_t` will be *wider* than `long double` (though not impossible, IIRC).
Stephen Canon
yCalleecharan
Thanks for explaining the difference between long double and double_t. I'll stick to long double simply :).
yCalleecharan
@yCalleecharan: With or without an adaptive step size? What's the step size (or what range can it lie in)?
Stephen Canon
No, without adaptive size. Step size is around 10000 s/2500000 = 0.004 s. It seems sufficient in my case.
yCalleecharan
@yCalleecharan: Without knowing exactly what equations (and their scaling) you're working with, I can't say for sure, but with such a large step size, `double` generally will deliver completely satisfactory answers (because the local truncation error from the method will be much larger than the round-off error from the numerics), and will be rather faster than `long double` on some platforms.
Stephen Canon
Yes, it's quite a large step size. The program takes 30 minutes or so to run on my machine, as I have to vary one parameter 300 times. The problem is that I had to increase the simulation time to 10000 s to eliminate all transients. I could decrease the step size by a further factor of 10 if I made my arrays 10 times larger, but I haven't tested whether that would be a problem in C.
yCalleecharan
A: 

double_t may be defined simply as typedef double double_t; in the implementation's headers. Of course, if you plan to rely on implementation specifics, you need to look at your own implementation.

Potatoswatter
Thanks for telling me how it is implemented. But then do I still need to include float.h?
yCalleecharan