questions about floating-point | ansaurus

floating-point

Confusing return statement

I'm failing to understand exactly what the IF statement is doing. From what I can see, it is checking whether the variable x is equal to the int 0. If this is true, the ABSOLUTE value of the variable y is returned... this is where I lose the plot: why would the return statement then go on to include <= EPSILON? Surely this means less than or equ...

evaluation-order

When to use Fixed Point these days

For intense number-crunching I'm considering using fixed point instead of floating point. Of course it'll matter how many bytes the fixed-point type takes, what CPU it'll be running on, and whether I can use (for Intel) MMX or SSE or whatever new things come up... I'm wondering if these days when floating point runs faster than ever...

Haskell FFI / C MPFR library wrapper woes

In order to create an arbitrary precision floating point / drop in replacement for Double, I'm trying to wrap MPFR using the FFI but despite all my efforts the simplest bit of code doesn't work. It compiles, it runs, but it crashes mockingly after pretending to work for a while. A simple C version of the code happily prints the number "1...

parseFloat function in JavaScript

When I add the values of two text boxes, 1.001 and 0.001, and then call parseFloat, I get 1.0019999999. I want 1.002. Can you help me? ...
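
The excerpt above describes a decimal sum that comes out slightly off in binary floating point. A minimal sketch of two common workarounds, written in Python rather than the asker's JavaScript: do the arithmetic in decimal, or round the binary result to the number of places you need. The helper name `add_decimal_strings` is hypothetical and only for illustration.

```python
from decimal import Decimal


def add_decimal_strings(a: str, b: str) -> Decimal:
    """Add two decimal strings exactly, without going through binary floats."""
    return Decimal(a) + Decimal(b)


# Exact decimal arithmetic on the text-box values.
print(add_decimal_strings("1.001", "0.001"))

# Alternative: add as binary floats, then round to three decimal places.
print(round(1.001 + 0.001, 3))
```

Which approach fits depends on whether the values are later used for more arithmetic (decimal is safer) or only displayed (rounding is usually enough).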

Why do I see a double variable initialized to some value like 21.4 as 21.399999618530273?

double r = 11.631; double theta = 21.4; In the debugger, these are shown as 11.631000000000000 and 21.399999618530273. How can I avoid this? ...
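
The value 21.399999618530273 in the excerpt above is what 21.4 becomes when it is rounded to 32-bit single precision and then widened back to a double. Whether that is what happened in the asker's program is an assumption, since the full code is not shown. A small Python sketch that reproduces the effect, using a hypothetical helper `as_float32`:

```python
import struct


def as_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value.

    The result is returned as a Python float (a double), so it shows
    exactly what a 32-bit float would hold.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


print(repr(21.4))              # the double closest to 21.4
print(repr(as_float32(21.4)))  # the single-precision value, widened to double
```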

How do you round a floating point number in Perl?

How can I round a decimal number (floating point) to the nearest integer? e.g. 1.2 = 1, 1.7 = 2 ...
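
A minimal sketch of the rounding asked about above, in Python rather than Perl. The helper name `round_half_up` is hypothetical. It adds 0.5 and takes the floor, which is the usual "round half up" rule; note that this simple form can misbehave for values a hair below a .5 boundary, so treat it as an illustration only.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with exact halves going upward."""
    return math.floor(x + 0.5)


print(round_half_up(1.2))
print(round_half_up(1.7))

# Python's built-in round() uses round-half-to-even instead,
# so the two rules differ on exact halves such as 2.5.
print(round(2.5), round_half_up(2.5))
```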

Adding floats with gmp gives "correct" results, sort of ...

In the code below I use mpf_add to add the string representation of two floating values. What I don't understand at this point is why 2.2 + 3.2 = 5.39999999999999999999999999999999999999. I would have thought that gmp was smart enough to give 5.4. What am I not comprehending about how gmp does floats? (BTW, when I first wrote this I w...

How to avoid rounding problems when comparing currency values in Delphi?

AFAIK, the Currency type in Delphi Win32 depends on the processor's floating point precision. Because of this I'm having rounding problems when comparing two Currency values, returning different results depending on the machine. For now I'm using the SameValue function, passing an Epsilon parameter = 0.009, because I only need 2 decimal digits ...

Stop Java and C from truncating my floats and doubles!

When I give Java and C BIG floats and doubles (in the billion range), they convert them to scientific notation, losing precision in the process. How can I stop this behavior? ...

Any C++ libraries available to convert between floating point representations?

I recently had a need to interpret a DEC 32-bit floating point representation. It differs from the IEEE floating point representations in the number of bits allocated to the exponent and mantissa. Here's a description of a bunch of floating point formats: http://www.quadibloc.com/comp/cp0201.htm I managed to roll my own C++ code to s...

C# - Is there a 32-bit float math library?

I'm planning on doing my next project in C# rather than C++ (using SlimDX). All of DirectX uses floats, however System.Math uses doubles. This means constantly converting between floats and doubles. So ideally I'd like to write all the code using floats, since I'm not getting any added precision converting to floats from doubles all the ...

double in .net

If I have the following code (this was written in .NET) double i = 0.1 + 0.1 + 0.1; Why doesn't i equal 0.3? Any ideas? ...
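
The same behaviour can be reproduced outside .NET, because it comes from IEEE 754 doubles rather than from any one language: 0.1 has no exact binary representation, so three rounded copies do not sum to the double nearest 0.3. A short Python sketch, with a tolerance-based comparison as one common alternative to `==`:

```python
import math

total = 0.1 + 0.1 + 0.1

# Exact equality fails because of accumulated rounding error.
print(total == 0.3)

# Comparing within a relative tolerance succeeds.
print(math.isclose(total, 0.3))
```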

Using NaN in C++?

What's the best way to use NaNs in C++? I found std::numeric_limits<double>::quiet_NaN() and std::numeric_limits<double>::signaling_NaN(). I'd like to use signaling_NaN to represent an uninitialized variable as follows: double diameter = std::numeric_limits<double>::signaling_NaN(); This, however, signals (raises an exception) on as...

How to use std::signaling_nan?

After looking at another question on SO (Using NaN in C++) I became curious about std::numeric_limits<double>::signaling_NaN(). I could not get signaling_NaN to throw an exception. I thought perhaps by signaling it really meant a signal so I tried catching SIGFPE but nope... Here is my code: double my_nan = numeric_limits<double>::sig...

Is it correct to compare two rounded floating point numbers using the == operator?

Or is there a chance that the operation will fail? Thanks. I chose the wrong term and what I really meant was rounding to 0, not truncation. The point is, I need to compare the integer part of two doubles and I'm just casting them to int and then using ==, but, as someone pointed out in one of my earlier questions, this could throw an...

How do I round a number in javascript?

Hey all. While working on a project, I came across a js-script created by a former employee that basically creates a report in the form of Name : Value Name2 : Value2 etc... Problem for me though, is that the values can sometimes be floats (with different precision), integers, or even in the form "2.20011E+17" What I outputted thoug...

Using C: How can I determine the sizes of the components of a floating point?

I am looking for suggestions on how to find the sizes (in bits) and range of floating point numbers in an architecture independent manner. The code could be built on various platforms (AIX, Linux, HPUX, VMS, maybe Windoze) using different flags - so results should vary. The sign, I've only seen as one bit, but how to measure the size o...

Can every float be expressed exactly as a double?

Can every possible value of a float variable be represented exactly in a double variable? In other words, for all possible values X will the following be successful: float f1 = X; double d = f1; float f2 = (float)d; if(f1 == f2) System.out.println("Success!"); else System.out.println("Failure!"); My suspicion is that there i...
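
The round trip in the excerpt above can be checked empirically. The sketch below is in Python rather than Java and uses a hypothetical helper, `survives_double_round_trip`. It builds a single-precision value from its raw 32 bits, widens it to a double, narrows it back, and compares the bits. NaN patterns are skipped, since NaN never compares equal and its payload need not be preserved.

```python
import struct


def survives_double_round_trip(bits: int) -> bool:
    """Check one float32 bit pattern: float32 -> double -> float32."""
    raw = struct.pack("<I", bits)
    as_double = struct.unpack("<f", raw)[0]  # widened to a Python double
    if as_double != as_double:
        return True  # NaN: skipped, not tested
    return struct.pack("<f", as_double) == raw


# Sample the 32-bit space with a stride instead of testing all 2**32 patterns.
sample = range(0, 2**32, 65537)
print(all(survives_double_round_trip(b) for b in sample))
```

A sampled check like this is evidence, not proof. The underlying reason is that a double has more exponent and significand bits than a float, so widening is exact.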

how-stuff-works

.NET bug when converting float to integer?

Check it out: this little .NET Console Program yields interesting results...notice how I'm converting a float to an integer in two different ways: using System; using System.Collections.Generic; using System.Linq; using System.Text; namespace CastVsConvert { class Program { static void Main(string[] args) { ...

How can I write a C++ function returning true if a real number is exactly representable with a double?

How can I write a C++ function returning true if a real number is exactly representable with a double? bool isRepresentable( const char* realNumber ) { bool answer = false; // what goes here? return answer; } Simple tests: assert( true==isRepresentable( "0.5" ) ); assert( false==isRepresentable( "0.1" ) ); ...
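
One way to approach the question above, sketched in Python rather than C++: parse the string as an exact rational, convert the same string to the nearest double, and check whether the two are equal. The function name `is_representable` mirrors the excerpt, but this is an illustrative sketch that assumes the input is a finite decimal literal, not the asker's implementation.

```python
from fractions import Fraction
import math


def is_representable(real_number: str) -> bool:
    """Return True if the decimal string is exactly equal to some double."""
    exact = Fraction(real_number)   # exact rational value of the text
    nearest = float(real_number)    # nearest double to that value
    if math.isinf(nearest):
        return False                # too large in magnitude for a double
    return Fraction(nearest) == exact


print(is_representable("0.5"))
print(is_representable("0.1"))
```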
