Suppose that we have a System.Decimal number.

For illustration, let's take one whose ToString() representation is as follows:

d.ToString() = "123.4500"

The following can be said about this Decimal. For our purposes here, scale is defined as the number of digits to the right of the decimal point. Effective scale is similar but ignores any trailing zeros that occur in the fractional part. (In other words, these parameters are defined like those of SQL's DECIMAL type, plus two effective variants to account for the System.Decimal concept of trailing zeros in the fractional part.)

  • Precision: 7
  • Scale: 4
  • EffectivePrecision: 5
  • EffectiveScale: 2

Given an arbitrary System.Decimal, how can I compute all four of these parameters efficiently and without converting to a String and examining the String? The solution probably requires Decimal.GetBits.
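
For reference, here is roughly what Decimal.GetBits exposes (this layout is documented: the first three ints hold the 96-bit mantissa and the fourth is a flags word carrying the scale and sign):

using System;

class GetBitsLayout
{
    static void Main()
    {
        // decimal.GetBits returns int[4]: elements 0-2 are the low,
        // mid and high 32 bits of the 96-bit mantissa; element 3 is
        // a flags word with the scale in bits 16-23 and the sign in bit 31.
        int[] bits = decimal.GetBits(123.4500m);

        int scale = (bits[3] >> 16) & 0xFF;              // 4
        bool isNegative = (bits[3] & int.MinValue) != 0; // false

        // For 123.4500m the mantissa is 1234500 and the scale is 4,
        // i.e. the stored value is 1234500 / 10^4.
        Console.WriteLine("lo={0}, mid={1}, hi={2}, scale={3}, negative={4}",
                          bits[0], bits[1], bits[2], scale, isNegative);
    }
}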

Some more examples:

Example  Precision  Scale  EffectivePrecision  EffectiveScale
0        1 (?)      0      1 (?)               0
0.0      2 (?)      1      1 (?)               0
12.45    4          2      4                   2
12.4500  6          4      4                   2
770      3          0      3                   0

(?) Alternatively, interpreting these precisions as zero would be fine.

+6  A: 

Yes, you'd need to use Decimal.GetBits. Unfortunately, you then have to work with a 96-bit integer, and there is no simple integer type in .NET which copes with 96 bits. On the other hand, it's possible that you could use Decimal itself...

Here's some code which produces the same numbers as your examples. Hope you find it useful :)

using System;

public class Test
{
    static public void Main(string[] x)
    {
        ShowInfo(123.4500m);
        ShowInfo(0m);
        ShowInfo(0.0m);
        ShowInfo(12.45m);
        ShowInfo(12.4500m);
        ShowInfo(770m);
    }

    static void ShowInfo(decimal dec)
    {
        // We want the integer parts as uint
        // C# doesn't permit int[] to uint[] conversion,
        // but .NET does. This is somewhat evil...
        uint[] bits = (uint[])(object)decimal.GetBits(dec);

        decimal mantissa = 
            (bits[2] * 4294967296m * 4294967296m) +
            (bits[1] * 4294967296m) +
            bits[0];

        // The scale is stored in bits 16-23 of the flags word.
        uint scale = (bits[3] >> 16) & 31;

        // Precision: number of times we can divide
        // by 10 before we get to 0        
        uint precision = 0;
        if (dec != 0m)
        {
            for (decimal tmp = mantissa; tmp >= 1; tmp /= 10)
            {
                precision++;
            }
        }
        else
        {
            // Handle zero differently. It's odd.
            precision = scale + 1;
        }

        uint trailingZeros = 0;
        for (decimal tmp = mantissa;
             tmp % 10m == 0 && trailingZeros < scale;
             tmp /= 10)
        {
            trailingZeros++;
        }

        Console.WriteLine("Example: {0}", dec);
        Console.WriteLine("Precision: {0}", precision);
        Console.WriteLine("Scale: {0}", scale);
        Console.WriteLine("EffectivePrecision: {0}",
                          precision - trailingZeros);
        Console.WriteLine("EffectiveScale: {0}", scale - trailingZeros);
        Console.WriteLine();
    }
}
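
For reference, in a culture that uses '.' as the decimal separator, the first call prints:

Example: 123.4500
Precision: 7
Scale: 4
EffectivePrecision: 5
EffectiveScale: 2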
Jon Skeet
Thank you, this is very interesting. Not nearly as fast as scraping the information out of ToString, as I show in a separate post.
binarycoder
Jon, if you think your (uint[])(object) cast is evil (I agree), why not use a slightly neater and more explicit approach?
Joren
I'll point out that this code returns precision and scale using scientific notation definitions. For example, 0.005 has Precision=1 and Scale=3. That is the ordinary way of doing things, but another interpretation is per the SQL DECIMAL type: there, Precision=3 and Scale=3, the smallest SQL decimal that could hold 0.005. The original question briefly mentioned SQL decimals but didn't give an example of this case (thanks to Jon Seigel, who recently pointed it out to me). My implementation (FastInfo below) gives the latter kind of precision and scale. Thanks!
binarycoder
@Joren: Because C# doesn't let you do the direct cast between int[] and uint[] even though the CLR does. Which more neat and explicit way were you thinking of?
Jon Skeet
Something like `Select(x => (uint)x).ToArray();`. I think it's clearly explicit. It's pretty simple and it doesn't violate C# casting rules, so I think it's also neat.
Joren
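
(For illustration, the two approaches under discussion, side by side; the LINQ version assumes a using System.Linq directive:)

int[] raw = decimal.GetBits(123.45m);

// Jon's approach: reinterpret the same array as uint[]. The CLR allows
// this conversion, which is why the detour through object compiles.
uint[] viaCast = (uint[])(object)raw;

// Joren's approach: copy the values into a genuinely new uint[].
uint[] viaLinq = raw.Select(i => (uint)i).ToArray();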
@Joren: While I'm not one for micro-optimising, I think that gratuitously creating a new iterator, a new array etc when a cast will actually work is a bit much. I think the comment explains pretty clearly what's going on - I find the comment + existing code clearer than the use of LINQ in this case. I think we'll have to agree to disagree.
Jon Skeet
Fair enough. :)
Joren
+2  A: 

Using ToString is about 10x faster than Jon Skeet's solution. While this is reasonably fast, the challenge here (if there are any takers!) is to beat the performance of ToString.

The performance results I get from the following test program are:

ShowInfo: 239 ms
FastInfo: 25 ms

using System;
using System.Diagnostics;
using System.Globalization;

public class Test
{
    static public void Main(string[] x)
    {
        Stopwatch sw1 = new Stopwatch();
        Stopwatch sw2 = new Stopwatch();

        sw1.Start();
        for (int i = 0; i < 10000; i++)
        {
            ShowInfo(123.4500m);
            ShowInfo(0m);
            ShowInfo(0.0m);
            ShowInfo(12.45m);
            ShowInfo(12.4500m);
            ShowInfo(770m);
        }
        sw1.Stop();

        sw2.Start();
        for (int i = 0; i < 10000; i++)
        {
            FastInfo(123.4500m);
            FastInfo(0m);
            FastInfo(0.0m);
            FastInfo(12.45m);
            FastInfo(12.4500m);
            FastInfo(770m);
        }
        sw2.Stop();

        Console.WriteLine(sw1.ElapsedMilliseconds);
        Console.WriteLine(sw2.ElapsedMilliseconds);
        Console.ReadLine();
    }

    // Be aware of how this method handles edge cases.
    // A few are counterintuitive, like the 0.0 case.
    // Also note that the goal is to report a precision
    // and scale that can be used to store the number in
    // an SQL DECIMAL type, so this does not correspond to
    // how precision and scale are defined for scientific
    // notation. The minimal precision SQL decimal can
    // be calculated by subtracting TrailingZeros as follows:
    // DECIMAL(Precision - TrailingZeros, Scale - TrailingZeros).
    //
    //     dec Precision Scale TrailingZeros
    // ------- --------- ----- -------------
    //   0             1     0             0
    // 0.0             2     1             1
    // 0.1             1     1             0
    // 0.01            2     2             0 [Differs from ShowInfo]
    // 0.010           3     3             1 [Differs from ShowInfo]
    // 12.45           4     2             0
    // 12.4500         6     4             2
    // 770             3     0             0
    static DecimalInfo FastInfo(decimal dec)
    {
        string s = dec.ToString(CultureInfo.InvariantCulture);

        int precision = 0;
        int scale = 0;
        int trailingZeros = 0;
        bool inFraction = false;
        bool nonZeroSeen = false;

        foreach (char c in s)
        {
            if (inFraction)
            {
                if (c == '0')
                    trailingZeros++;
                else
                {
                    nonZeroSeen = true;
                    trailingZeros = 0;
                }

                precision++;
                scale++;
            }
            else
            {
                if (c == '.')
                {
                    inFraction = true;
                }
                else if (c != '-')
                {
                    if (c != '0' || nonZeroSeen)
                    {
                        nonZeroSeen = true;
                        precision++;
                    }
                }
            }
        }

        // Handles cases where all digits are zeros.
        if (!nonZeroSeen)
            precision += 1;

        return new DecimalInfo(precision, scale, trailingZeros);
    }

    struct DecimalInfo
    {
        public int Precision { get; private set; }
        public int Scale { get; private set; }
        public int TrailingZeros { get; private set; }

        public DecimalInfo(int precision, int scale, int trailingZeros)
            : this()
        {
            Precision = precision;
            Scale = scale;
            TrailingZeros = trailingZeros;
        }
    }

    static DecimalInfo ShowInfo(decimal dec)
    {
        // We want the integer parts as uint
        // C# doesn't permit int[] to uint[] conversion,
        // but .NET does. This is somewhat evil...
        uint[] bits = (uint[])(object)decimal.GetBits(dec);

        decimal mantissa =
            (bits[2] * 4294967296m * 4294967296m) +
            (bits[1] * 4294967296m) +
            bits[0];

        // The scale is stored in bits 16-23 of the flags word.
        uint scale = (bits[3] >> 16) & 31;

        // Precision: number of times we can divide
        // by 10 before we get to 0 
        uint precision = 0;
        if (dec != 0m)
        {
            for (decimal tmp = mantissa; tmp >= 1; tmp /= 10)
            {
                precision++;
            }
        }
        else
        {
            // Handle zero differently. It's odd.
            precision = scale + 1;
        }

        uint trailingZeros = 0;
        for (decimal tmp = mantissa;
            tmp % 10m == 0 && trailingZeros < scale;
            tmp /= 10)
        {
            trailingZeros++;
        }

        return new DecimalInfo((int)precision, (int)scale, (int)trailingZeros);
    }
}
binarycoder
I'm not entirely surprised - we're doing a lot of operations with decimals simply due to the lack of a 96-bit integer type. If you use ulong instead of decimal for the mantissa, completely ignoring the top 32 bits, it's slightly faster than FastInfo - but of course, it doesn't work for all decimals! I suspect we can improve the speed by counting both the precision and the trailing zeros in one loop (as both divide by 10 each time).
Jon Skeet
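
(A rough sketch of what Jon describes; hypothetical code that works only while the high 32 bits of the mantissa are zero, and counts precision and trailing zeros in a single pass:)

// Hypothetical fast path: usable only when the mantissa fits in 64 bits,
// i.e. decimal.GetBits(dec)[2] == 0. Counts precision and trailing zeros
// in one loop, since both walk the mantissa one digit at a time.
static DecimalInfo UlongInfo(decimal dec)
{
    int[] bits = decimal.GetBits(dec);
    if (bits[2] != 0)
        throw new ArgumentException("Mantissa does not fit in 64 bits.", "dec");

    ulong mantissa = ((ulong)(uint)bits[1] << 32) | (uint)bits[0];
    int scale = (bits[3] >> 16) & 31;

    int precision = 0;
    int trailingZeros = 0;
    bool inTrailingRun = true;

    for (ulong tmp = mantissa; tmp != 0; tmp /= 10)
    {
        if (inTrailingRun && tmp % 10 == 0 && trailingZeros < scale)
            trailingZeros++;
        else
            inTrailingRun = false;
        precision++;
    }

    if (precision == 0)
        precision = scale + 1; // Zero is special-cased, as in ShowInfo.

    return new DecimalInfo(precision, scale, trailingZeros);
}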
The string-based algorithm produces incorrect results for numbers with leading zeros, e.g. 0.555, and for numbers that have zeros between the decimal point and the significant digits, e.g. 0.0005.
Jon Seigel
<produces incorrect results> Thanks, I have adapted the code. I have also added a comment above the FastInfo method to point out that it uses SQL precision and scale definitions, not the usual scientific notation ones. (The original question mentioned that this is meant to deal with SQL decimals.)
binarycoder
Given that SQL stores decimal values with false precision, I'm not sure they have any business defining their own standards of precision, scale, etc. I still have to use Jon's algorithm for my program because it gives me the results I expect, but thanks for fixing your code. +1
Jon Seigel