I want to round floats with a bias, either always down or always up. I only need this at one specific point in the code; the rest of the program should round to the nearest value as usual.

For example, I want to round to the nearest multiple of 1/10. The closest floating point number to 7/10 is approximately 0.69999998807, but the closest number to 8/10 is approximately 0.80000001192. When I round off numbers, these are the two results I get. I'd rather get them rounded the same way. 7/10 should round to 0.70000004768 and 8/10 should round to 0.80000001192.

In this example I am always rounding up, but I have some places where I want to always round down. Fortunately, I am only dealing with positive values in each of these places.

The line I am using to round is floor(val * 100 + 0.5) / 100. I am programming in C++.

+9  A: 

I think the best way to achieve this is to rely on the fact that, under the IEEE 754 floating point standard, the bit patterns of positive floats, reinterpreted as integers, are ordered the same way as the float values themselves. (Floats are stored as sign and magnitude rather than two's complement, so this only holds as-is for positive values, which is what you have.)

I.e. you could simply add one to the integer representation, which adds one ulp (unit in the last place) and gives the next floating point value. That value will always be slightly larger than your threshold if the original was slightly smaller, since the rounding error is at most 1/2 ulp.

e.g.

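 #include <cstdint>
 #include <cstring>
 #include <iomanip>
 #include <iostream>

 float floatValue = 7.f/10;
 std::cout << std::setprecision(20) << floatValue << std::endl;
 std::int32_t asInt;
 std::memcpy(&asInt, &floatValue, sizeof asInt);       // copy the bits; *(int*)&floatValue breaks strict aliasing
 asInt += 1;                                           // one ulp up (positive floats only)
 std::memcpy(&floatValue, &asInt, sizeof floatValue);  // copy the bits back
 std::cout << floatValue << std::endl;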

prints (on my system)

 0.69999998807907104492
 0.70000004768371582031
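
The ordering property this relies on can be checked directly. Below is a minimal sketch, assuming a 32-bit IEEE 754 float; the helper names bitsOf, floatFrom and nextUp are made up for illustration. It copies the bits with std::memcpy and only considers positive, finite inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: the raw bit pattern of a float as an unsigned integer.
std::uint32_t bitsOf(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Hypothetical helper: rebuild a float from a raw bit pattern.
float floatFrom(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Hypothetical helper: the next representable float above a positive, finite f.
float nextUp(float f) {
    assert(f > 0.0f && std::isfinite(f));  // the integer trick is only used for positive values here
    return floatFrom(bitsOf(f) + 1);
}
```

For positive inputs, bitsOf(a) < bitsOf(b) should hold exactly when a < b, and nextUp(x) should agree with std::nextafter(x, INFINITY).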

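To know when you need to add one ulp, compare the floor of the scaled value with the floor of the scaled value plus one half. For a value that has already been rounded to a multiple of 1/100, the two differ only when the stored float has landed just below the intended multiple: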

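 if (std::floor(floatValue * 100.) != std::floor(floatValue * 100. + 0.5)) {
    std::int32_t asInt;                                   // needs <cstdint> and <cstring>
    std::memcpy(&asInt, &floatValue, sizeof asInt);       // copy the bits instead of casting pointers
    asInt += 1;                                           // one ulp up
    std::memcpy(&floatValue, &asInt, sizeof floatValue);
 }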

This correctly converts 0.69.. to 0.70.. but leaves 0.80.. alone.

Note that multiplying by 100. (a double literal) promotes the float to a double before the floor is applied.

If you don't do this, you risk the situation where, for

 7.f/10.f * 100.f

the product, rounded to the limited precision of a float, comes out as exactly 70.00..., so the floor can no longer tell that the stored value was below 0.7.
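
To make the difference concrete, here is a small sketch that floors the same product computed both ways. The function names are made up for illustration, and it assumes IEEE 754 floats with float arithmetic actually rounded to float precision (as on typical x86-64 builds).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the product is evaluated in double, so the tiny
// shortfall of the stored float survives and floor can see it.
double flooredInDouble(float v) {
    assert(v > 0.0f);  // the sketch only considers positive values
    return std::floor(v * 100.);
}

// Hypothetical helper: the product is rounded to float first, which can
// snap it onto the whole number and hide the shortfall from floor.
double flooredInFloat(float v) {
    assert(v > 0.0f);
    float product = v * 100.f;
    return std::floor(product);
}
```

Under those assumptions, flooredInDouble(0.7f) gives 69 while flooredInFloat(0.7f) gives 70, which is why the check has to be done in double.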

Pieter
+1 for awesome bit-twiddling jiggedy pokery. I feel smarter after reading that.
sheepsimulator
cool :) simpler would be to use aliasing directly: (int
I just discovered a function in C99 called nexttoward which can be used to increment the ulp, nicely covering up the bit-twiddling. http://www.penguin-soft.com/penguin/man/3/nextafter.html
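
As a rough sketch of that last suggestion, the conditional bump from the answer can be written with std::nextafter from <cmath> (C++11) in place of the integer manipulation. The name bumpIfBelow is made up for illustration, and the scale of 100 and the positive-only precondition are carried over from the question.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: if the stored float sits just below the intended
// multiple of 1/100, move it up to the next representable float.
float bumpIfBelow(float v) {
    assert(v > 0.0f);  // same positive-only assumption as the question
    // Same test as in the answer, with the products evaluated in double.
    if (std::floor(v * 100.) != std::floor(v * 100. + 0.5)) {
        v = std::nextafter(v, std::numeric_limits<float>::infinity());
    }
    return v;
}
```

With this, bumpIfBelow(0.7f) should move up by one ulp while bumpIfBelow(0.8f) is returned unchanged. Passing 0.0f as the second argument of std::nextafter steps a positive value down instead, which may help for the places that always round down.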