I'm working on a scientific computation & visualization project in C#/.NET, and we use doubles to represent all the physical quantities. Since floating-point numbers always carry a bit of rounding error, we have simple methods for equality comparisons, such as:

const double EPSILON = 1e-6;

static bool ApproxEquals(double d1, double d2) {
    return Math.Abs(d1 - d2) < EPSILON;
}

Pretty standard.

However, we constantly find ourselves having to adjust the magnitude of EPSILON as we encounter situations in which the error of "equal" quantities is greater than we had anticipated. For example, if you multiply 5 large doubles together and then divide 5 times, you lose a lot of accuracy. It's gotten to the point where we can't make EPSILON too much larger or else it's going to give us false positives, but we still get false negatives as well.
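To make the effect concrete, here's a contrived round trip (the factors are arbitrary; only the shape of the computation matters):

using System;

class PrecisionDemo {
    static void Main() {
        // Algebraically this is the identity: scale up by five factors,
        // then divide them all back out. Each step rounds to the
        // nearest representable double, so the round trip rarely lands
        // exactly on the starting value.
        double x = 1000.0;
        double[] factors = { 3.7e9, 8.1e11, 2.9e10, 5.5e13, 1.3e12 };

        foreach (double f in factors) x *= f;   // up to ~1e60
        foreach (double f in factors) x /= f;   // back down to ~1000

        Console.WriteLine(x);           // usually a few ulps off 1000
        Console.WriteLine(x == 1000.0); // usually False
    }
}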

In general our approach has been to look for more numerically-stable algorithms to work with, but the program is very computational and there's only so much we've been able to do.

Does anyone have any good strategies for dealing with this problem? I've looked into the Decimal type a bit, but I'm concerned about performance and I don't really know enough about it to know whether it would solve the problem or only obscure it. I would be willing to accept a moderate performance hit (say, 2x) by going to Decimal if it would fix these problems, but performance is definitely a concern: the code is mostly limited by floating-point arithmetic, so I don't think that's unreasonable. I've seen people quoting a 100x difference, which would definitely be unacceptable.

Also, switching to Decimal has other complications, such as general lack of support in the Math library, so we would have to write our own square root function, for example.

Any advice?

EDIT: by the way, the fact that I'm using a constant epsilon (instead of a relative comparison) is not the point of my question. I just put that there as an example; it's not actually a snippet of my code. Changing to a relative comparison wouldn't make a difference for the question, because the problem arises from losing precision when numbers get very big and then small again. For example, I might have a value 1000 and then do a series of calculations on it that should result in exactly the same number, but due to loss of precision I actually end up with 1001. If I then go to compare those numbers, it doesn't matter much whether I use a relative or an absolute comparison (as long as I've defined the comparison in a way that is meaningful to the problem and its scale).

Anyway, as Mitch Wheat suggested, reordering of the algorithms did help with the problems.

+2  A: 

This is not a problem unique to .NET. The strategy for reducing loss of precision is to re-order calculations so that you multiply large quantities by small quantities and add/subtract similarly sized quantities (without changing the nature of the calculation, obviously).

In your example, rather than multiplying 5 large quantities together and then dividing by 5 large quantities, re-order to divide each large quantity by one of the divisors, and then multiply the 5 partial quotients, as in the sketch below.
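A sketch of the two orderings in C# (hypothetical helper names; assumes the factor and divisor arrays have equal length):

static double RatioNaive(double[] num, double[] div) {
    // Multiply everything up front: the intermediate products can
    // overflow or drift far from the scale of the final answer.
    double n = 1.0, d = 1.0;
    foreach (double x in num) n *= x;
    foreach (double x in div) d *= x;
    return n / d;
}

static double RatioPaired(double[] num, double[] div) {
    // Pair each factor with a divisor: every intermediate result
    // stays near the magnitude of the final answer.
    double r = 1.0;
    for (int i = 0; i < num.Length; i++)
        r *= num[i] / div[i];
    return r;
}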

Of interest, if you haven't already read it: What Every Computer Scientist Should Know About Floating-Point Arithmetic

Mitch Wheat
@nobugz: if you feel that strongly, undelete and edit your answer and I'll remove the downvote.
Mitch Wheat
+1. Crunching numbers on computers, as you have discovered, is full of snares and pitfalls, which is what makes doing it so much fun. In your particular case you might want to consider using a measure of relative error rather than the absolute error you are using. If you are unsure of the difference between absolute and relative errors, re-read chapter 1 of your favourite Numerical Analysis text.
High Performance Mark
@High-Performance Mark: was this comment intended for the OP?
Mitch Wheat
@Mitch -- the OP and everyone else bar you and a few others on SO who have detailed understanding of everything f-p. I'm not trying to teach my grandmother to suck eggs, more a niece really.
High Performance Mark
Well I've been going through our algorithms and doing some reordering as you suggested, and it seems to have helped. I've started to work my way through that article, although it's not exactly light reading! Thanks for the help.
Henry Jackson
@High-Performance Mark: the question of relative/absolute comparison was not the one I was asking. See the edit to my question.
Henry Jackson
@Henry Jackson: Agreed: not light reading, but very important if you are working with high precision numerical calculations.
Mitch Wheat
+1  A: 

Due to the way real numbers are typically represented, you can do this in C (and probably in unsafe C#):

#include <stdlib.h>  /* for llabs */

if (llabs(*(long long *)&x - *(long long *)&y) <= EPSILON) {
    // Close enough
}

This is obviously non-portable and probably a bad idea, but it has the significant advantage of scale-independence. That is, EPSILON can be some small constant like 1, 10 or 100 (depending on your desired tolerance), and it will handle proportional rounding errors correctly regardless of the exponent.
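In C#, the same bit-level comparison can be written without unsafe code via BitConverter.DoubleToInt64Bits; a minimal sketch (mixed-sign and NaN inputs would need extra handling):

static bool NearlyEqual(double x, double y, long maxUlps) {
    // Reinterpret each double's bits as a signed 64-bit integer.
    // For two finite values of the same sign, the integer distance
    // counts the representable doubles between them, so the
    // tolerance is scale-independent.
    long xi = BitConverter.DoubleToInt64Bits(x);
    long yi = BitConverter.DoubleToInt64Bits(y);
    return Math.Abs(xi - yi) <= maxUlps;
}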

Disclaimer: This is my own private invention, and has not been vetted by anyone with a clue (like, say, a mathematician with a background in discrete arithmetic).

Marcelo Cantos
Let me rephrase that: How is that any better than the original floating point tolerance (epsilon) technique?
Mitch Wheat
Epsilon normally has to be scaled in proportion to the magnitude of the numbers you expect to deal with, which makes it vulnerable to surprises in the magnitude of actual data. This is precisely the problem stated in the question. Comparing the integer representation of the numbers isn't vulnerable to such scaling issues (note that the choice of EPSILON in my answer is determined by the desired tolerance, not by the expected scale of the inputs). @John Knoeller's answer is the best safe-and-portable answer, but it is more expensive (albeit probably marginally so).
Marcelo Cantos
+1  A: 

Your best answer is always better algorithms, of course. But it seems to me that if your values aren't all within a couple of orders of magnitude of 1, then using a fixed epsilon is not a good strategy. What you want to do instead is to ensure that the values are equal to within some reasonable number of digits of precision.

// are the values equal to within 12 (or so) digits of precision?
static bool ApproxEquals(double d1, double d2) {
    return Math.Abs(d1 - d2) < Math.Abs(d1) * 1e-12;
}

If this were C++, then you could also pull some tricks to compare the mantissa and exponent separately, but I can't think of any way to do this safely in managed code.

John Knoeller
As I've added in an edit to the question, this doesn't really address the question I had. But it's true that a relative comparison is appropriate when the comparisons are being done on numbers with varying orders of magnitude.
Henry Jackson