questions about floating-point | ansaurus

floating-point

How to fix this problem in PHP?

$onethird = 1.0/3; $fivethirds = 1.0/3+1.0/3+1.0/3+1.0/3+1.0/3; $half = 1.0/2; $threehalf = 1.0/2+1.0/2+1.0/2; var_dump($onethird + $fivethirds == $half + $threehalf); This outputs false, but as we all know, 5/3 + 1/3 = 2 = 3/2 + 1/2. How to fix this problem? ...
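A minimal sketch of one possible fix, written in Python rather than PHP on the assumption that both use IEEE 754 doubles for these values: keep the quantities as exact rationals so that no rounding occurs. The variable names are illustrative and not part of the original question.

```python
from fractions import Fraction

# The same sums as in the question, in binary floating point.
one_third = 1.0 / 3
five_thirds = 1.0 / 3 + 1.0 / 3 + 1.0 / 3 + 1.0 / 3 + 1.0 / 3
half = 1.0 / 2
three_halves = 1.0 / 2 + 1.0 / 2 + 1.0 / 2

float_lhs = one_third + five_thirds   # each addition may round
float_rhs = half + three_halves       # halves are exact in binary

# The same sums with exact rational arithmetic: nothing is rounded.
exact_lhs = Fraction(1, 3) + 5 * Fraction(1, 3)
exact_rhs = Fraction(1, 2) + 3 * Fraction(1, 2)

print(float_lhs == float_rhs)  # depends on how the thirds round
print(exact_lhs == exact_rhs)  # exact comparison of 2 with 2
```

Exact rationals trade speed for correctness; when that cost is too high, comparing with a small tolerance is the usual alternative.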

Is (1 + sqrt(2))^2 = 3 + 2*sqrt(2) satisfied in floating-point arithmetic?

In mathematics the identity (1 + sqrt(2))^2 = 3 + 2*sqrt(2) holds true. But in floating-point calculations (IEEE 754, using single precision, i.e. 32 bits) this is not necessarily the case, as sqrt(2) doesn't have an exact representation in binary. So does using an approximate value of sqrt(2) give different results for the left- and right-hand sides? I...
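One way to check this empirically is sketched below. Python floats are doubles, so the hypothetical helper `f32` rounds every intermediate result to single precision through `struct`. The sketch assumes that rounding a double result of a single +, * or sqrt to 32 bits matches a native single-precision operation.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


root2 = f32(math.sqrt(2.0))          # single-precision sqrt(2)
one_plus = f32(1.0 + root2)
lhs = f32(one_plus * one_plus)       # (1 + sqrt(2))^2
rhs = f32(3.0 + f32(2.0 * root2))    # 3 + 2*sqrt(2)

difference = lhs - rhs
print(lhs, rhs, difference)
```

Whatever the printed difference is, it can only be a small multiple of the single-precision spacing near 5.8, which is about 4.8e-7.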

floating-accuracy

Why does this subtraction not equal zero?

I happened upon these values in my ColdFusion code but the Google calculator seems to have the same "bug" where the difference is non-zero. 416582.2850 - 411476.8100 - 5105.475 = -2.36468622461E-011 http://www.google.com/search?hl=en&rlz=1C1GGLS_enUS340US340&q=416582.2850+-+411476.8100+-+5105.475&aq=f&oq=&aqi= Java...
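The values in the question are decimal fractions that have no exact binary representation, which is the usual source of such a residue. A small sketch in Python (not ColdFusion; the variable names are illustrative) contrasts binary doubles with decimal arithmetic on the same inputs:

```python
from decimal import Decimal

# Binary doubles: each literal is rounded to the nearest double first.
float_result = 416582.2850 - 411476.8100 - 5105.475

# Decimal arithmetic: the literals are stored digit for digit.
decimal_result = (
    Decimal("416582.2850") - Decimal("411476.8100") - Decimal("5105.475")
)

print(float_result)    # a tiny residue, as reported in the question
print(decimal_result)  # zero, with four decimal places
```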

Float versus Integer arithmetic performance on modern chips

Consider a Viterbi decoder on an additive model. It spends its time doing additions and comparisons. Now, consider two: one with C/C++ float as the data type, and another with int. On modern chips, would you expect int to run significantly faster than float? Or will the wonders of pipelining (and the absence of multiplication and divisio...

Converting double to float without relying on the FPU rounding mode

Does anyone have a handy code snippet to convert an IEEE 754 double to the nearest float below it (or, respectively, above it), without changing or assuming anything about the FPU's current rounding mode? Note: this constraint probably implies not using the FPU at all. I expect the simplest way to do it in these conditions is to read the...

bit-manipulation

double precision C++

I think the precision of double is causing that problem, as described in similar posts, but I would like to know if there is a way to achieve the correct result. I'm using a function template which compares two parameters and returns true if they are equal. template <class T> bool eq(T one, T two) { if (one == two) return true; ...
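The excerpt is cut off before the failing case, so the following is only a sketch of a common remedy, shown in Python rather than C++: compare within a tolerance instead of with `==`. The default tolerances are illustrative assumptions and would need tuning for real data.

```python
import math


def eq(one: float, two: float,
       rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Treat two floats as equal when they differ only by rounding noise."""
    return math.isclose(one, two, rel_tol=rel_tol, abs_tol=abs_tol)


print(0.1 + 0.2 == 0.3)    # exact comparison
print(eq(0.1 + 0.2, 0.3))  # tolerant comparison
```

The relative tolerance scales with the operands, while the absolute tolerance covers comparisons against values at or near zero.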

getfloat returning 23.7999

#define MAXBUF 1000
int buf[MAXBUF];
int buffered = 0;
int bufp = 0;
int getch() {
    if (bufp > 0) {
        if (!--bufp) buffered = 0;
        return buf[bufp];
    } else {
        buffered = 0;
        return getchar();
    }
}
void ungetch(int c) {
    buf[bufp++] = c;
    buffered = 1;
}
int getfloat(float *pn) { ...

float is getting mangled when passing between methods (typecasting problem?)

I'm having trouble passing a float value from one object to another. It appears to be fine in the first method, but in the second its value is huge. I assume this is some kind of problem with my typecasting, because that's the thing I understand least. Help is greatly appreciated! In my game controller, I do this: float accurac...

Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True?

I've been going through a Haskell tutorial recently and noticed this behaviour when trying some simple Haskell expressions in the interactive ghci shell:
Prelude> 1.1 + 1.1 == 2.2
True
Prelude> 1.1 + 1.1 + 1.1 == 3.3
False
Prelude> 1.1 + 1.1 + 1.1 > 3.3
True
Prelude> 1.1 + 1.1 + 1.1
3.3000000000000003
Does anybody know why that is? ...
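The same behaviour can be reproduced outside ghci, since a Haskell `Double` and a Python float are both IEEE 754 doubles. A short Python sketch prints the exact values that are stored, which shows that neither 1.1 nor 3.3 is held exactly:

```python
from decimal import Decimal

total = 1.1 + 1.1 + 1.1

# Decimal(float) expands the stored binary value exactly, digit for digit.
print(Decimal(1.1))    # the double nearest to 1.1
print(Decimal(total))  # the rounded sum of three such doubles
print(Decimal(3.3))    # the double nearest to 3.3

print(total > 3.3)
```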

What happens in C++ when an integer type is cast to a floating point type or vice-versa?

Do the underlying bits just get "reinterpreted" as a floating point value? Or is there a run-time conversion to produce the nearest floating point value? Is endianness a factor on any platforms (i.e., endianness of floats differs from ints)? How do different width types behave (e.g., int to float vs. int to double)? What does the l...
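To illustrate the distinction the question draws, here is a sketch in Python (not C++) that contrasts a value conversion with a reinterpretation of the same 32 bits. In C++ an ordinary cast between integer and floating types is a value conversion. The helper names are hypothetical.

```python
import struct


def convert(n: int) -> float:
    """Value conversion: produce the float nearest to the integer's value."""
    return float(n)


def reinterpret(n: int) -> float:
    """Reinterpretation: read the 32 bits of an int as a single-precision float."""
    return struct.unpack("<f", struct.pack("<i", n))[0]


def to_f32(n: int) -> float:
    """Value conversion to single precision (24-bit significand)."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]


print(convert(1))               # 1.0
print(reinterpret(1))           # the smallest positive subnormal float
print(reinterpret(0x3F800000))  # the bit pattern of 1.0f
print(to_f32(16777217))         # 2**24 + 1 does not fit in 24 bits
print(convert(16777217))        # a double holds it exactly
```

The last two lines show the width difference: every 32-bit int fits exactly in a double, but not in a float.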

Floating Point Math Execution Time

What accounts for the added execution time of the first data set? The assembly instructions are the same. With DN_FLUSH flag not on, the first data set takes 63 milliseconds, the second set takes 15 milliseconds. With DN_FLUSH flag on, the first data set takes 15 milliseconds, the second set takes ~0 milliseconds. Therefore, in both...

Library for strict floating point Math in .NET

I have an algorithm/computation in Java and a unit test for it. The unit test expects the result with some precision/delta. Now I have ported the algorithm to .NET and would like to use the same unit test. I work with the double data type. The problem is that Java uses strictfp (64 bits) for some operations in the Math class, whereas .NET always uses the FPU/CPU (80 ...

iPhone Thumb & VFP

How do 'compile for Thumb' and VFP code relate to each other? On the iPhone 2G/3G, I know that the Thumb instruction set doesn't include floating-point calculations (on the 3GS, Thumb2 apparently does). So what happens if one compiles for Thumb but uses VFP code for floating-point calculations? I know that's pretty in-depth, but probabl...

Why is Math.sqrt(i*i).floor == i?

I am wondering if this is true: When I take the square root of a squared integer, like in f = Math.sqrt(123*123) I will get a floating point number very close to 123. Due to floating point representation precision, this could be something like 122.99999999999999999999 or 123.000000000000000000001. Since floor(122.999999999999999999) ...
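The claim can be checked empirically. The sketch below uses Python instead of Ruby and assumes the platform's `sqrt` is correctly rounded, as IEEE 754 requires. Under that assumption, the square root of an exactly representable perfect square comes back as exactly the integer, so `floor` cannot land one too low.

```python
import math


def floor_sqrt_of_square(i: int) -> int:
    """floor(sqrt(i*i)) computed through floating point."""
    return math.floor(math.sqrt(i * i))


# i*i stays below 2**53 here, so it converts to a double exactly.
mismatches = [i for i in range(100_000) if floor_sqrt_of_square(i) != i]
print(mismatches)

# For arbitrarily large integers, an integer square root avoids floats.
print(math.isqrt(123 * 123))
```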

numeric-precision

Are there compilers that optimise floating point operations for accuracy (as opposed to speed)?

We know that compilers are getting better and better at optimising our code to make it run faster, but my question is: are there compilers that can optimise floating-point operations to ensure greater accuracy? For example, a basic rule is to perform multiplications before addition, because multiplication and division using floating...
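A small Python sketch shows why evaluation order matters for accuracy, which is also why a compiler cannot reorder floating-point operations without changing results. It contrasts two groupings of the same sum and a plain accumulation loop with the standard library's `math.fsum`, which tracks the lost low-order bits.

```python
import math

# Floating-point addition is not associative.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)
print(left_grouped, right_grouped)

values = [0.1] * 10

# Plain left-to-right accumulation rounds after every addition.
naive_total = 0.0
for v in values:
    naive_total += v

# fsum keeps track of the rounding errors and rounds once at the end.
accurate_total = math.fsum(values)
print(naive_total, accurate_total)
```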

floating-accuracy

Floating point and integers?

Why do we need both integers and floating point in a processor? Thank you ...

Using C/C++ to efficiently de-serialize a string comprised of floats, tokens and blank lines

I have large strings that resemble the following... some_text_token 24.325973 -20.638823 -1.964366 0.753947 -1.290811 -3.547422 0.813014 -3.547227 0.472015 3.723311 -0.719116 3.676793 other_text_token 24.325973 20.638823 -1.964366 0.753947 -1.290811 -3.547422 -1.996611 -2.877422 0.813014 -3.547227 1.63236...

Why don't I get zero when I subtract the same floating point number from itself in Perl?

Possible Duplicates: Why is floating point arithmetic in C# imprecise? Why does ghci say that 1.1 + 1.1 + 1.1 > 3.3 is True? #!/usr/bin/perl $l1 = "0+0.590580+0.583742+0.579787+0.564928+0.504538+0.459805+0.433273+0.384211+0.3035810"; $l2 = "0+0.590580+0.583742+0.579788+0.564928+0.504538+0.459805+0.433272+0.384211+0.3035810"; $...

floating-accuracy

Check if a double is evenly divisible by another double in C?

How can I check if a double x is evenly divisible by another double y in C? With integers I would just use modulo, but what would be the correct/best way to do it with doubles? I know floating point numbers carry with them imprecision, but I'm getting the double from standard input. Maybe I should not scan it as a double straight away b...
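Following the question's own hint about not scanning the input as a double straight away, one option is to parse the text into exact rationals and test divisibility there. The sketch below is in Python rather than C, and the function name `evenly_divisible` is hypothetical.

```python
import math
from fractions import Fraction


def evenly_divisible(x_text: str, y_text: str) -> bool:
    """True if x / y is a whole number, using the decimal text as typed."""
    x = Fraction(x_text)
    y = Fraction(y_text)
    if y == 0:
        raise ZeroDivisionError("divisor is zero")
    return (x / y).denominator == 1


print(evenly_divisible("0.3", "0.1"))  # exact: 3/10 divided by 1/10
print(math.fmod(0.3, 0.1))             # the same check on rounded doubles
```

If the values must stay doubles, the usual fallback is `fmod` with a tolerance, checking whether the remainder is close to either 0 or the divisor.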

flush-to-zero behavior in floating-point arithmetic

While, as far as I remember, IEEE 754 says nothing about a flush-to-zero mode to handle denormalized numbers faster, some architectures offer this mode (e.g. http://docs.sun.com/source/806-3568/ncg_lib.html ). In the particular case of this technical documentation, standard handling of denormalized numbers is the default, and flush-to-z...
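For context, a short Python sketch of what standard handling of denormalized numbers (gradual underflow) looks like. It assumes the interpreter runs with the default IEEE 754 behaviour. With flush-to-zero enabled, the subnormal values below would be replaced by zero.

```python
import sys

smallest_normal = sys.float_info.min  # 2**-1022 for IEEE doubles
subnormal = smallest_normal / 2       # below the normal range, still non-zero
smallest_subnormal = 5e-324           # 2**-1074, the last step before zero

print(subnormal > 0.0)
print(subnormal * 2 == smallest_normal)  # halving and doubling lose nothing here
print(smallest_subnormal / 2)            # underflows to zero
```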
