tags:

views:

88

answers:

2

I found that dictionary lookup could be very slow if compared to flat array access. Any idea why? I'm using Ants Profiler for performance testing. Here's a sample function that reproduces the problem:

    private static void NodeDisplace()
    {
        var nodeDisplacement = new Dictionary<double, double[]>();

        var times = new List<double>();
        for (int i = 0; i < 6000; i++)
        {
            times.Add(i * 0.02);
        }
        foreach (var time in times)
        {
            nodeDisplacement.Add(time, new double[6]);
        }

        var five = 5;
        var six = 6;
        int modes = 10;
        var arrayList = new double[times.Count*6];
        for (int i = 0; i < modes; i++)
        {
            int k=0;
            foreach (var time in times)
            {
                for (int j = 0; j < 6; j++)
                {

                    var simpelCompute = five * six;  // 0.027 sec
                    nodeDisplacement[time][j] = simpelCompute;  //0.403 sec
                    arrayList[6*k+j] = simpelCompute;  //0.0278 sec
                }

                k++;
            }
        }
    }

Notice the relative magnitude between flat array access and dictionary access? Flat array is about 20 times faster than dictionary access ( 0.403/0.0278), after taking into account of the array index manipulation ( 6*k+j).

As weird as it sounds, but dictionary lookup is taking a major portion of my time, and I have to optimize it.

+9  A: 

Yes, I'm not surprised. The point of dictionaries is that they're used to look up arbitrary keys. Consider what has to happen for a single array dereference:

  • Check bounds
  • Multiply index by element size
  • Add index to pointer

Very, very fast. Now for a dictionary lookup (very rough; depends on implementation):

  • Potentially check key for nullity
  • Take hash code of key
  • Find the right slot for that hash code (probably a "mod prime" operation)
  • Probably dereference an array element to find the information for that slot
  • Compare hash codes
  • If the hash codes match, compare for equality (and potentially go on to the next hash code match)

If you've got "keys" which can very easily be used as array indexes instead (e.g. contiguous integers, or something which can easily be mapped to contiguous integers) then that will be very, very fast. That's not the primary use case for hash tables. They're good for situations which can't easily be mapped that way - for example looking up by string, or by arbitrary double value (rather than doubles which are evenly spaced, and can thus be mapped to integers easily).

I would say that your title is misleading - it's not that dictionary lookup is slow, it's that when arrays are a more suitable approach, they're ludicrously fast.

Jon Skeet
@Jon, I *am* using `double` that is evenly spaced ( see the above code), and yet it is still slow.
Ngu Soon Hui
@Ngu: That's my point - you're using a range which is easily mapped to integers, so use an array instead. Use the most appropriate data structure for the job - which in this case is an array.
Jon Skeet
+2  A: 

In addition the Jon's answer I would like to add that your inner loop does not do very much, normally you do a least some more work in the inner loop and then the relative performance loss of the dictionary is somewhat lower.

If you look at the code for Double.GetHashCode() in Reflector you'll find that it is executing 4 lines of code (assuming your double is not 0), just that is more than the body of your inner loop. Dictionary<TKey, TValue>.Insert() (called by the set indexer) is even more code, almost a screen full.

The thing with Dictionary compared to a flat array is that you don't waste to much memory when your keys are not dense (as they are in your case) and that read and write are ~O(1) like arrays (but with a higher constant).

As a side note you can use a multi dimensional array instead of the 6*k+j trick.
Declare it this way

var arrayList = new double[times.Count, 6];

and use it this way

arrayList[k ,j] = simpelCompute;

It won't be faster, but it is easier to read.

Albin Sunnanbo