questions about floating-point | ansaurus

floating-point

How to efficiently compare the sign of two floating-point values while handling negative zeros

Given two floating-point numbers, I'm looking for an efficient way to check whether they have the same sign, with the rule that if either value is zero (+0.0 or -0.0), they should be considered to have the same sign. For instance: SameSign(1.0, 2.0) should return true; SameSign(-1.0, -2.0) should return true; SameSign(-1.0, 2.0) should ret...
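A minimal Python sketch of the rule described above, not a tuned answer to the "efficient" part. The name same_sign is made up for illustration, and NaN handling is left unspecified because the excerpt does not mention it.

```python
import math

def same_sign(a: float, b: float) -> bool:
    # A zero of either sign matches anything, per the rule stated in the question.
    # Note that -0.0 == 0.0 is True, so both zeros are caught here.
    if a == 0.0 or b == 0.0:
        return True
    # copysign transfers only the sign bit, giving +1.0 or -1.0 for each input.
    return math.copysign(1.0, a) == math.copysign(1.0, b)

print(same_sign(1.0, 2.0))
print(same_sign(-1.0, -2.0))
print(same_sign(-1.0, 2.0))
print(same_sign(-0.0, 5.0))
```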

Floating Point Arithmetic - Modulo Operator on Double Type

So I'm trying to figure out why the modulo operator is returning such a large, unusual value. If I have the code: double result = 1.0d % 0.1d; it will give a result of 0.09999999999999995. I would expect a value of 0. Note that this problem doesn't exist with the division operator: double result = 1.0d / 0.1d; will give a result of 1...
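The same effect can be reproduced in Python, whose float is the same IEEE 754 double. This is a sketch of one way to see where the remainder comes from; it uses exact rational arithmetic to inspect the stored value of 0.1.

```python
from fractions import Fraction

# Fraction(x) converts a float to the exact rational value it stores.
tenth = Fraction(0.1)
print(tenth > Fraction(1, 10))   # the stored 0.1 is slightly above 1/10

# So only nine whole copies of the stored 0.1 fit into 1.0, and the
# remainder is just under 0.1 instead of 0.
remainder = 1.0 % 0.1
print(remainder)
print(Fraction(remainder) == 1 - 9 * tenth)

# Division rounds its result once, and the nearest double is exactly 10.0.
print(1.0 / 0.1)
```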

accuracy-problems

C# - Incrementing a double value (1.212E+25)

I have a double value that equals 1.212E+25. When I write it out as text I use myVar.ToString("0000000000000000000000"). The problem is that even if I do myVar++ 3 or 4 times, the value seems to stay the same. Why is that? ...
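A short Python sketch (same 64-bit double as C#) showing how far apart adjacent doubles are at this magnitude. The variable names are illustrative only.

```python
import math

x = 1.212e25

# frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1.
_, e = math.frexp(x)
# A normal double has 53 significant bits, so neighbours are 2**(e - 53) apart.
spacing = 2.0 ** (e - 53)

print(spacing)          # gap between adjacent doubles near x
print(x + 1.0 == x)     # adding 1 is far below half the gap, so it rounds back to x
print(x + spacing > x)  # a step of one full gap does change the value
```

With a gap this wide, an increment of 1 is simply rounded away each time.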

Why does 99.99 / 100 = 0.9998999999999999

Possible Duplicate: Dealing with accuracy problems in floating-point numbers. Whereas 99.99 * 0.01 = 0.99. Clearly this is the age-old floating-point rounding issue; however, the rounding error in this case seems quite large to me. What I mean is I might have expected a result of 0.99990000001 or some similar 'close' result. An...

floating-accuracy

SQL Server float datatype

The documentation for SQL Server Float says: "Approximate-number data types for use with floating point numeric data. Floating point data is approximate; therefore, not all values in the data type range can be represented exactly." Which is what I expected it to say. If that is the case, though, why does the following return 'Ye...

Convert ieee 754 float to hex with c - printf

Ideally the following code would take a float in IEEE 754 representation and convert it into hexadecimal: void convert() //gets the float input from user and turns it into hexadecimal { float f; printf("Enter float: "); scanf("%f", &f); printf("hex is %x", f); } I'm not too sure what's going wrong. It's converting the...
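In C, a float passed to printf is promoted to double, while %x expects an unsigned int, so the format and the argument do not match. The idea of reinterpreting the four bytes of the float as an integer can be sketched in Python like this; float32_hex is a made-up helper name, not part of the question.

```python
import struct

def float32_hex(value: float) -> str:
    # Pack as big-endian IEEE 754 binary32, then reinterpret the 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return f"{bits:08x}"

print(float32_hex(1.0))
print(float32_hex(-2.5))
```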

Can I make gcc tell me when a calculation results in NaN or inf at runtime?

Is there a way to tell gcc to throw a SIGFPE or something similar in response to a calculation that results in NaN or (-)inf at runtime, like it would for a divide-by-zero? I've tried the -fsignaling-nans flag, which doesn't seem to help. ...

CUDA: accumulate data into a large histogram of floats

I'm trying to think of a way to implement the following algorithm using CUDA: Working on a large volume of voxels, for each voxel I calculate an index i and a value c. after the calculation I need to perform histogram[i] += c c is a float value and the histogram can have up to 15,000 bins. I'm looking for a way to implement this effici...

Is memset(ary,0,length) a portable way of setting a double array to zero?

Possible Duplicate: What is faster/preferred, memset or a for loop, to zero out an array of doubles? The following code uses memset to set all the bits to zero: int length = 5; double *array = (double *) malloc(sizeof(double)*length); memset(array,0,sizeof(double)*length); for(int i=0;i<length;i++) if(array[i]!=0.0) fprintf(st...
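On platforms where double is IEEE 754 binary64, the all-zero bit pattern is +0.0; the C standard itself does not promise that representation. A small Python sketch that decodes zeroed bytes as doubles, assuming the IEEE 754 layout:

```python
import math
import struct

length = 5
raw = bytes(8 * length)                      # 40 zero bytes, as memset(array, 0, ...) would leave
values = struct.unpack(f"<{length}d", raw)   # decode as five binary64 values

print(values)
print(all(v == 0.0 for v in values))
print(math.copysign(1.0, values[0]))         # sign of the decoded zero
```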

Can bad stuff happen when dividing 1/a very small float?

If I want to check that positive float A is less than the inverse square of another positive float B (in C99), could something go wrong if B is very small? I could imagine checking it like if(A<1/(B*B)) but if B is small enough, would this possibly result in infinity? If that were to happen, would the code still work correctly in al...

MSVC win32: convert extended precision float (80-bit) to double (64-bit)

What is the most portable and "right" way to convert an extended precision float (80-bit value, also known as "long double" in some compilers) to double (64-bit) in MSVC win32/win64? MSVC currently (as of 2010) treats "long double" as a synonym for "double". I could probably write fld/fstp assembler pair in inline asm, but inli...

How many double numbers are there between 0.0 and 1.0?

This is something that's been on my mind for years, but I never took the time to ask before. Many (pseudo) random number generators generate a random number between 0.0 and 1.0. Mathematically there are infinitely many numbers in this range, but double is a floating-point number, and therefore has finite precision. So the questions are: J...
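One way to count, sketched in Python and assuming IEEE 754 binary64: for non-negative doubles the raw bit patterns sort in the same order as the values, so the pattern of 1.0 read as an integer is the number of doubles in [0.0, 1.0). The helper name double_bits is made up for this example.

```python
import struct

def double_bits(value: float) -> int:
    # Reinterpret the 8 bytes of a binary64 value as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return bits

# Counts +0.0, all subnormals, and every normal double below 1.0.
count_below_one = double_bits(1.0) - double_bits(0.0)
print(count_below_one)
print(hex(double_bits(1.0)))
```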

Python floating-point number

I am kind of confused about why Python adds some additional decimal digits in this case. Please help to explain: >>> mylist = ["list item 1", 2, 3.14] >>> print mylist ['list item 1', 2, 3.1400000000000001] ...
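Printing a list shows the repr of each item, and older Python versions formatted float reprs with 17 significant digits. A short sketch in current Python that makes the stored value visible:

```python
from decimal import Decimal

x = 3.14
print(Decimal(x))     # exact value of the double nearest to 3.14
print("%.17g" % x)    # 17 significant digits, as older Python reprs printed
print(repr(x))        # Python 3 prints the shortest string that round-trips
```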

floating-accuracy

Rounding up to a certain floating point - MATLAB

Hi. I'm new to MATLAB. I think it's a simple question. I want: a=1.154648126486416 to become a=1.154 and not a=1.54000000000. How do I do that without using format('bank')? Thanks. ...
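The wanted result (1.154 from 1.1546...) looks like truncation to three decimals rather than rounding, so this sketch scales, floors, and rescales. It is written in Python rather than MATLAB, and for negative inputs floor would move away from zero.

```python
import math

a = 1.154648126486416
truncated = math.floor(a * 1000) / 1000   # drop everything past the third decimal
print(truncated)
```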

How should I deal with floating-point numbers that can get so small that they become zero?

So I just fixed an interesting bug in the following code, but I'm not sure the approach I took is the best: p = 1 probabilities = [ ... ] # a (possibly) long list of numbers between 0 and 1 for wp in probabilities: if (wp > 0): p *= wp # Take the natural log, this crashes when 'probabilities' is long enough that p ends up # being...
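A common alternative is to add logarithms instead of multiplying and taking the log at the end. This is only a sketch, not necessarily the fix the asker chose, and the list contents below are hypothetical because the original list is elided.

```python
import math

probabilities = [1e-5] * 100   # hypothetical data; the original list is elided

# Multiplying many small numbers underflows to 0.0, and math.log(0.0) raises.
p = 1.0
for wp in probabilities:
    if wp > 0:
        p *= wp
print(p)

# Adding logs keeps the running value at a comfortable magnitude.
log_p = sum(math.log(wp) for wp in probabilities if wp > 0)
print(log_p)
```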

numerical-methods

How do I verify/validate a float before storing as decimal(4,1) in SQL?

I have defined a column in SQL to be decimal(4,1), null which means I am storing four digits, up to one of which can be to the right of the decimal point. The source data should always be in the range of 0 to 999.9, but due to an error beyond my control, I received the number -38591844.0. Obviously this won't store in SQL and gives the ...
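A possible pre-check, sketched in Python under assumptions: the function name fits_decimal_4_1 is made up, the value is rounded to one decimal place with Decimal's default rounding (which may differ from what the database does), and negative values down to -999.9 are accepted because decimal(4,1) itself allows them.

```python
import math
from decimal import Decimal

def fits_decimal_4_1(value: float) -> bool:
    """Return True if value, rounded to one decimal place, fits decimal(4,1)."""
    # Reject NaN, infinities, and anything that clearly needs more than 3 integer digits.
    if not math.isfinite(value) or abs(value) >= 1000.0:
        return False
    rounded = Decimal(repr(value)).quantize(Decimal("0.1"))
    # Rounding can still push a value such as 999.96 up to 1000.0.
    return abs(rounded) <= Decimal("999.9")

print(fits_decimal_4_1(999.9))
print(fits_decimal_4_1(-38591844.0))
```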

Use VBScript to get a 32-bit floating point into binary byte or word representation

Hello, I don't know how to do two somewhat related tasks within VBScript (not VB): - I need to break a 32-bit floating point into its 4-byte binary representation. - I need to break a 32-bit floating point into its 2-word (aka 16-bit) binary representation. For example, 65535.0 in binary format is 1000111011111111111111100000000 65535.0 in ...
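The byte and word split can be illustrated in Python (not VBScript, so this shows the target layout rather than a drop-in answer). Big-endian order is an assumption here; the excerpt does not say which byte order is wanted.

```python
import struct

value = 65535.0
raw = struct.pack(">f", value)                  # 4 bytes, most significant first

byte_values = list(raw)                         # four 8-bit pieces
word_values = list(struct.unpack(">2H", raw))   # two 16-bit pieces
(bits,) = struct.unpack(">I", raw)              # the whole 32-bit pattern

print([f"{b:02X}" for b in byte_values])
print([f"{w:04X}" for w in word_values])
print(f"{bits:032b}")
```

The 32-character binary string is the question's 31-digit pattern with the leading zero sign bit restored.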

Float32 to Float16

Can someone explain to me how I convert a 32-bit floating point value to a 16-bit floating point value? (s = sign e = exponent and m = mantissa) If 32-bit float is 1s7e24m And 16-bit float is 1s5e10m Then is it as simple as doing? int fltInt32; short fltInt16; memcpy( &fltInt32, &flt, sizeof( float ) ); fltInt16 = (fltInt32 &...
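For reference, IEEE 754 binary32 has 1 sign, 8 exponent and 23 fraction bits, and binary16 has 1, 5 and 10, with different exponent biases (127 and 15), so masking and shifting alone is not enough. Python's struct module can do the conversion with the "e" format code, which makes a handy way to check a hand-written version; float_to_half_bits is a made-up helper name.

```python
import struct

def float_to_half_bits(value: float) -> int:
    # The "e" format code packs IEEE 754 binary16, rounding the value as needed;
    # the two bytes are then read back as an unsigned 16-bit integer.
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits

for v in (1.0, -2.0, 0.5, 65504.0):
    print(v, f"{float_to_half_bits(v):04x}")
```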

How to convert an integer to a floating point value in x86 ASM?

I need to multiply an integer (two's complement) by a floating point constant. Here is what I have: .data pi dd 3.14 int dd 0ah .code fld pi ??? fmul ST(1), ST How can I convert int to a floating point value for multiplying against pi? ...

Double type returns -1.#IND/NaN error when calculating pi iteratively

I am working through a problem for my MCTS certification. The program has to calculate pi until the user presses a key, at which point the thread is aborted, the result returned to the main thread and printed in the console. Simple enough, right? This exercise is really meant to be about threading, but I'm running into another problem. T...
