Summary of the problem:

For some decimal values, when we convert from decimal to double, a small fraction is added to the result.

What makes it worse is that there can be two "equal" decimal values that result in different double values when converted.

Code sample:

decimal dcm = 8224055000.0000000000m;  // dcm = 8224055000
double dbl = Convert.ToDouble(dcm);    // dbl = 8224055000.000001

decimal dcm2 = Convert.ToDecimal(dbl); // dcm2 = 8224055000
double dbl2 = Convert.ToDouble(dcm2);  // dbl2 = 8224055000.0

decimal deltaDcm = dcm2 - dcm;         // deltaDcm = 0
double deltaDbl = dbl2 - dbl;          // deltaDbl = -0.00000095367431640625

Look at the results in the comments. The results are copied from the debugger's watch window. The numbers that produce this effect have far fewer decimal digits than the limits of the data types, so it can't be an overflow (I guess!).

What makes it much more interesting is that there can be two equal decimal values (in the code sample above, see "dcm" and "dcm2", with "deltaDcm" equal to zero) resulting in different double values when converted (in the code, "dbl" and "dbl2", which have a non-zero "deltaDbl").

I guess it's something related to a difference in the bitwise representation of the numbers in the two data types, but I can't figure out what! And I need to know what to do to make the conversion come out the way I need it to (like dcm2 -> dbl2).

+2  A: 

The article What Every Computer Scientist Should Know About Floating-Point Arithmetic would be an excellent place to start.

The short answer is that floating-point binary arithmetic is necessarily an approximation, and it's not always the approximation you would guess. This is because CPUs do arithmetic in base 2, while humans (usually) do arithmetic in base 10. There are a wide variety of unexpected effects that stem from this.
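As a quick illustration of the base-2/base-10 mismatch, a minimal, hypothetical console snippet shows that even 0.1 has no exact binary representation:

using System;

class BaseTwoDemo
{
    static void Main()
    {
        // 0.1 cannot be represented exactly in binary, so adding it
        // ten times does not land exactly on 1.0.
        double sum = 0.0;
        for (int i = 0; i < 10; i++)
            sum += 0.1;

        Console.WriteLine(sum == 1.0);   // False
        Console.WriteLine(1.0 - sum);    // a tiny non-zero difference
    }
}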

Greg Hewgill
Thanks for the article link; it's a very long one, but I will try to read it. Base 2 arithmetic vs. base 10 arithmetic is what I was suspicious of, but there are two points: 1. decimal has 28-29 significant digits, and double has 15-16 significant digits. 8 significant digits are enough for my number, so why should it be treated like that? And as long as there is an exact representation of the original number in double, why should the conversion result in a different one? 2. What about the two "same" decimal values getting converted to different doubles?
Iravanchi
The number of significant digits isn't particularly relevant - "0.1" only has one significant digit, but still isn't representable in float/double. The point about there *being* an exact representation available is a much more significant one. As for the two values giving different doubles - they're *equal* but they're not the *same*.
Jon Skeet
Is there a way of converting those "equal but not same" decimals to each other? And is there a way to see that in the debugger? (I guess I'd need to see the bitwise representation, but there's no such option in VS, and "Hexadecimal display" doesn't work this way either.)
Iravanchi
Decimal.GetBits will give you the bitwise representation - you'd want to normalize by way of that. It won't be easy :( Do you know that the value is *actually* an integer? If so, that would help...
Jon Skeet
The number is "actually" an integer for this instance. But it can be a non-integer. What's for sure is that it doesn't (and won't) have 16 significant digits.
Iravanchi
Irchi: it's very easy to convert "equal but not same" decimals; see my code sample. Also `Decimal.ToString()` faithfully shows you trailing zeros, so you can distinguish "equal but not same" decimals by that even in the VS debugger's normal watch windows etc.
Anton Tykhyy
A: 

This is an old problem, and has been the subject of many similar questions on StackOverflow.

The simplistic explanation is that decimal numbers can't be exactly represented in binary.

This link is an article which might explain the problem.

pavium
That doesn't explain it, actually. *Many* decimal numbers can't be exactly represented in binary - but in this case the input *can* be exactly represented in binary. Data is being lost unnecessarily.
Jon Skeet
Jon, data isn't being lost, on the contrary — it's the *unnecessarily preserved* (from Irchi's POV, no offense) data that's the trouble.
Anton Tykhyy
Anton, see the spec posted by Jon. The unnecessarily preserved data should not ruin the conversion. After the 16 significant digits, the decimal value specifies the remaining digits to all be "0". Why should it be rounded to "1" in the 16th position?! "0" is closer to the "exact" decimal value than "1".
Iravanchi
I don't know about 'should', not being a standards man — but this is how it behaves and the only question is what to do about this behaviour.
Anton Tykhyy
@Jon, I have *emphasised* the word 'simplistic' in my answer, for the record.
pavium
@pavium: My point is that I don't believe this is really a case of the normal situation which has led to many other questions. Usually, you have a number which *cannot be exactly represented* as a double. In this case, we don't.
Jon Skeet
@Jon, yes you're right, I hadn't looked at it that carefully: the numbers are pretty much integers. There's no reason why data should be 'lost'; the loss must be in the `decimal` to `double` conversion.
pavium
+11  A: 

Interesting - although I generally don't trust normal ways of writing out floating point values when you're interested in the exact results.

Here's a slightly simpler demonstration, using DoubleConverter.cs which I've used a few times before.

using System;

class Test
{
    static void Main()
    {
        decimal dcm1 = 8224055000.0000000000m;
        decimal dcm2 = 8224055000m;
        double dbl1 = (double) dcm1;
        double dbl2 = (double) dcm2;

        Console.WriteLine(DoubleConverter.ToExactString(dbl1));
        Console.WriteLine(DoubleConverter.ToExactString(dbl2));
    }
}

Results:

8224055000.00000095367431640625
8224055000

Now the question is why the original value (8224055000.0000000000), which is an integer - and exactly representable as a double - ends up with extra data in it. I strongly suspect it's due to quirks in the algorithm used to convert from decimal to double, but it's unfortunate.

It also violates section 6.2.1 of the C# spec:

For a conversion from decimal to float or double, the decimal value is rounded to the nearest double or float value. While this conversion may lose precision, it never causes an exception to be thrown.

The "nearest double value" is clearly just 8224055000... so this is a bug IMO. It's not one I'd expect to get fixed any time soon though. (It gives the same results in .NET 4.0b1 by the way.)

To avoid the bug, you probably want to normalize the decimal value first, effectively "removing" the extra 0s after the decimal point. This is somewhat tricky as it involves 96-bit integer arithmetic - the .NET 4.0 BigInteger class may well make it easier, but that may not be an option for you.
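For what it's worth, a rough sketch of one way such a normalization could look, assuming .NET 4.0's System.Numerics.BigInteger is available (the DecimalNormalizer name and the overall approach are purely illustrative):

using System;
using System.Numerics;

static class DecimalNormalizer
{
    // Illustrative sketch: strip trailing decimal zeros by dividing the
    // 96-bit mantissa by 10 while the scale allows it, then rebuild the decimal.
    public static decimal Normalize(decimal value)
    {
        int[] bits = decimal.GetBits(value);   // { lo, mid, hi, flags }
        bool negative = bits[3] < 0;           // sign is bit 31 of flags
        int scale = (bits[3] >> 16) & 0xFF;    // scale is bits 16-23 of flags

        // Reassemble the 96-bit unsigned mantissa.
        BigInteger mantissa =
              ((BigInteger) (uint) bits[2] << 64)
            | ((BigInteger) (uint) bits[1] << 32)
            |  (uint) bits[0];

        while (scale > 0 && mantissa % 10 == 0)
        {
            mantissa /= 10;
            scale--;
        }

        return new decimal(
            (int) (uint) (mantissa & 0xFFFFFFFF),
            (int) (uint) ((mantissa >> 32) & 0xFFFFFFFF),
            (int) (uint) ((mantissa >> 64) & 0xFFFFFFFF),
            negative,
            (byte) scale);
    }
}

Under those assumptions, DecimalNormalizer.Normalize(8224055000.0000000000m) would produce a decimal with scale 0, which then converts to double without the stray bit.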

Jon Skeet
+1. As usual, a very nice explanation! :)
Mitch Wheat
This is a bug IMO too. Have you or anyone else reported this to Microsoft? I'm searching MS Connect and can't see anything related, so I'm posting it. Just want to know whether they confirm this as a bug or not.
Iravanchi
96-bit arithmetic is not necessary in this particular case, because one can get `decimal` to do the heavy lifting :)
Anton Tykhyy
Fascinating bug! As Anton Tykhyy notes, this is almost certainly because the representation of decimals with lots of extra precision is no longer "natively" in integers that fit into a double without representation error. I would be willing to bet up to a dollar that this bug has been in OLE Automation for fifteen years -- we use the OA libraries for decimal coding. I happen to have an archive of OA sources from ten years ago on my machine; if I have some free time tomorrow I'll take a look.
Eric Lippert
Customer support doesn't get much better than this :)
Jon Skeet
@Jon, I've used a part of your answer when reporting this issue on MS Connect (The C# spec part). Thanks for the info.
Iravanchi
@Irchi - if you've already got Eric's ear, you've got a good head-start ;-p
Marc Gravell
+11  A: 

The answer lies in the fact that decimal attempts to preserve the number of significant digits. Thus, 8224055000.0000000000m has 20 significant digits and is stored as 82240550000000000000E-10, while 8224055000m has only 10 and is stored as 8224055000E+0. double's mantissa is (logically) 53 bits, i.e. at most 16 decimal digits. This is exactly the precision you get when you convert to double, and indeed the stray 1 in your example is in the 16th decimal place. The conversion isn't 1-to-1 because double uses base 2.

Here are the binary representations of your numbers:

dcm:
00000000000010100000000000000000 00000000000000000000000000000100
01110101010100010010000001111110 11110010110000000110000000000000
dbl:
0.10000011111.1110101000110001000111101101100000000000000000000001
dcm2:
00000000000000000000000000000000 00000000000000000000000000000000
00000000000000000000000000000001 11101010001100010001111011011000
dbl2 (8224055000.0):
0.10000011111.1110101000110001000111101101100000000000000000000000

For double, I used dots to delimit sign, exponent and mantissa fields; for decimal, see MSDN on decimal.GetBits, but essentially the last 96 bits are the mantissa. Note how the mantissa bits of dcm2 and the most significant bits of dbl2 coincide exactly (don't forget about the implicit 1 bit in double's mantissa), and in fact these bits represent 8224055000. The mantissa bits of dbl are the same as in dcm2 and dbl2 but for the nasty 1 in the least significant bit. The exponent of dcm is 10, and the mantissa is 82240550000000000000.
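Roughly how such bit patterns can be obtained (a small illustrative snippet; not necessarily how the tables above were produced):

using System;

class BitDump
{
    static void Main()
    {
        decimal dcm = 8224055000.0000000000m;
        double dbl = (double) dcm;

        // decimal.GetBits returns { lo, mid, hi, flags };
        // the scale sits in bits 16-23 of flags, the sign in bit 31.
        int[] bits = decimal.GetBits(dcm);
        Console.WriteLine("flags={0:X8} hi={1:X8} mid={2:X8} lo={3:X8}",
                          bits[3], bits[2], bits[1], bits[0]);

        // For double, the raw 64-bit pattern is 1 sign bit,
        // 11 exponent bits and 52 mantissa bits.
        long raw = BitConverter.DoubleToInt64Bits(dbl);
        Console.WriteLine(Convert.ToString(raw, 2).PadLeft(64, '0'));
    }
}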

Update II: It is actually very easy to lop off trailing zeros.

// There are 28 trailing zeros in this constant;
// no decimal can have more than 28 decimal places
const decimal PreciseOne = 1.0000000000000000000000000000m ;

// decimal.ToString() faithfully prints trailing zeroes
Assert ((8224055000.000000000m).ToString () == "8224055000.000000000") ;

// Let System.Decimal.Divide() do all the work
Assert ((8224055000.000000000m / PreciseOne).ToString () == "8224055000") ;
Assert ((8224055000.000010000m / PreciseOne).ToString () == "8224055000.00001") ;
Anton Tykhyy
This makes sense, but see Jon Skeet's answer. Logically, specifying more significant digits should result in a more accurate conversion, not a worse one! Is there a way to convert the decimal to one which has "fewer" significant digits? That should result in a better conversion in my case!
Iravanchi
The conversion *is* more accurate — you get 6 extra digits — but the result is not what you expect because decimal's and double's bases are different. I'll add an example momentarily.
Anton Tykhyy
It's not a more accurate conversion. The exact value of the decimal is available, so should be returned. I can see why it happens, but that doesn't make it right :)
Jon Skeet
Well, if you understand "accurate" in this sense, I agree.
Anton Tykhyy
Thanks, the binary representations are very helpful. I know this is what happens, but my question is "why" exactly does the conversion place that "1" at the end? A conversion from a higher-precision value to a lower-precision value is supposed to perform rounding. It's as if converting "0.0" to integer resulted in 0, but converting "0.000000000000" to integer resulted in 1!
Iravanchi
It is rounding, but not to a decimal digit (base 10), but to a binary digit (base 2).
Blindy
@Blindy Rounding to base 2 should produce the same result. Why, when more digits are specified in the decimal, does it get rounded to "1", but when fewer digits are specified it gets rounded to "0"?
Iravanchi
That's a clever bit of code, but I don't know whether the algorithm for deciding the resulting number of significant digits is well-specified. I think I'd personally rather use a big integer library and do it from a more theoretical point of view - but as I said, it's a nifty bit of code.
Jon Skeet
As for "accurate" - a fairly simple measure of accuracy is "what's the difference between the exact number being represented to start with, and the the exact value of the result of the conversion"? 0 represents complete accuracy - at least in terms of the magnitude of the number, and is available in this case. That's what I meant. As double doesn't have a concept of "the number of significant digits" I don't believe the accuracy can measured in those terms. (It could for other conversions, e.g. to another type which *did* preserve the number of significant digits.)
Jon Skeet
Good point about double not having significant digits. Also, there is a much easier way to trim off the trailing zeros than that piece of code I wrote (see code sample).
Anton Tykhyy