views:

191

answers:

5

I'm doing a Dictionary<> lookup in an O(n^2) loop and need it to be ridiculously fast. It's not. Does anyone have any insight into how Dictionary<> is implemented? After running my code through a profiler and determining that Dictionary lookups are the bulk of the CPU time, I'm testing Dictionary performance with an isolated test case. My test code is like this:

Int32[] keys = new Int32[10] { 38784, 19294, 109574, 2450985, 5, 398, 98405, 12093, 909802, 38294394 };

Dictionary<Int32, MyData> map = new Dictionary<Int32, MyData>();
//Add a bunch of things to map


timer.Start();
Object item;
for (int i = 0; i < 1000000; i++)
{
   for (int j = 0; j < keys.Length; j++)
   {
      bool isFound = map.ContainsKey(keys[j]);
      if (isFound)
      {
         item = map[keys[j]];
      }
   }
}
timer.Stop();

ContainsKey and map[] are the two slow parts (equally slow). If I use TryGetValue instead, it's nearly identical in speed to ContainsKey. Here are some interesting facts:

A Dictionary<Guid, T> is about twice as slow as Dictionary<Int32, T>. Dictionary<String, T> is about twice as slow as a Guid dictionary. A Dictionary<Byte, T> is a good 50% faster than using Ints. This leads me to believe that a Dictionary is doing an O(log n) binary search to find the key, and the comparison operators on the keys are the bottleneck. For some reason, I don't believe it's implemented as a Hashtable, because .NET already has a Hashtable class, and in my experience it's even slower than Dictionary.

The dictionaries I'm building are only accessed by one thread at a time, so read locking is not an issue. RAM is also not an issue. The dictionary will most likely only have about 10 buckets, but each bucket can point to one of about 2,000 possible things. Does anyone have any feedback on how to make this faster? Thanks!

Mike

A: 

If you only have 10 buckets with 2000 things each, could you just build a single list with all 20000 things, directly indexable by a key known to your loop? For example:

List<MyData> ItemList = new List<MyData>();

//add all items to list indexed by their key (RAM is not an issue right?)

item = ItemList[key];

This way you reference them directly with no dictionary or hash lookup.

Joshua
Yeah, but building that list would take way too long. I do that several hundred thousand times.
Mike
How is List<> implemented? If it's not doing a hash, is lookup O(N)?
Mike
Oh, dumb question: List<> indexes by offset like an array. But I'm guessing Contains is O(N), or O(log N) when you pass in a compare function.
Mike
Mike: Contains is O(N); BinarySearch is O(lgN)
Gabe
Yeah, I figured that's why List.Contains() has an overload to pass in a comparer; then they could do a binary search. This makes sense.
Mike
+5  A: 

The dictionary is implemented using a hash table; I looked at the code with Reflector a while back.

"The dictionary will most likely only have about 10 buckets, but each bucket can point to one of about 2,000 possibly things."

There is your problem. The dictionary uses the hash to locate the bucket, but the lookup in the bucket is linear.

You have to implement a hash algorithm with a better distribution to get better performance. The relation should be at least the opposite, i.e. 2000 buckets with 10 items each.
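As a sketch of this technique (the question doesn't show the real key type, so `MyKey` and its fields are hypothetical): a key whose GetHashCode mixes all of its fields spreads entries across many buckets instead of collapsing into ~10.

```csharp
using System;

// Hypothetical composite key; this only illustrates the technique of
// providing a well-distributed hash, not the asker's actual key type.
public struct MyKey : IEquatable<MyKey>
{
    public readonly int GroupId; // e.g. ~10 distinct values
    public readonly int ItemId;  // e.g. ~2,000 distinct values

    public MyKey(int groupId, int itemId)
    {
        GroupId = groupId;
        ItemId = itemId;
    }

    public bool Equals(MyKey other)
    {
        return GroupId == other.GroupId && ItemId == other.ItemId;
    }

    public override bool Equals(object obj)
    {
        return obj is MyKey && Equals((MyKey)obj);
    }

    // Mix both fields so keys don't all land in the same handful of buckets.
    // The multiplier is an arbitrary large prime; unchecked allows the
    // multiplication to overflow harmlessly.
    public override int GetHashCode()
    {
        unchecked
        {
            return GroupId * 486187739 + ItemId;
        }
    }
}
```

Equal keys must produce equal hash codes (that's the GetHashCode/Equals contract); beyond that, the goal is simply that distinct keys rarely share a bucket.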

Guffa
+1 - exactly the problem. The hash algo is totally borked (as in: inefficient and not properly implemented).
TomTom
Well that explains why Int32 is so much faster. Int32 just uses itself for the hash code, whereas Guid and String probably produce some collisions. My test with an Int32 Dictionary can do about 10M lookups in 900ms, which is probably about as good as I'm gonna get.
Mike
So let me get this straight: Dictionary slows down when multiple keys "hash" to the same bucket, correct? Once you fix that, a lookup is O(1)? This basically answers my question.
Mike
OK, this seems to be working: once I implement GetHashCode on my key, my speeds are about identical to a Dictionary with an Int32 key.
Mike
Mike: In the example you have above, with .Net 4.0, `map` is a hash table with 17 buckets, of which 7 are filled. That means for every 10 keys you look up, 3 will have a collision. In your case, the collisions are 98405, 109574 in bucket 9 and 38294394, 398, 38784 in bucket 7. This means 38784 will require 3 comparisons, 109574 and 398 will require 2, and all other lookups will require 1 comparison.
Gabe
If I recall correctly, the bucket will be computed with hashvalue % n, where n is the number of buckets, correct? If this is the case, how does Dictionary<> know what n is? Does it get tricky and try to resize n on the fly as you add more stuff? Is there a way to set n? I want to prevent /all/ collisions. I don't care about memory in this case. Thanks!
Mike
@Mike: The dictionary uses something tricky, where n is a prime number; I don't know exactly what it does when the dictionary resizes. Anyhow, those are implementation details, and the dictionary combines buckets as part of the implementation. You could try to figure out how that works, but the implementation may change and you may end up hurting performance instead.
Guffa
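To make the bucket arithmetic from Gabe's earlier comment concrete, here is a small sketch assuming bucket = hash % n with n = 17; the real Dictionary<> internals differ and may change between framework versions.

```csharp
using System;

class BucketDemo
{
    static void Main()
    {
        // The sample keys from the question.
        int[] keys = { 38784, 19294, 109574, 2450985, 5,
                       398, 98405, 12093, 909802, 38294394 };

        // Int32.GetHashCode() returns the value itself, so for non-negative
        // keys the bucket is simply key % bucketCount.
        const int bucketCount = 17; // a prime, per Gabe's .NET 4.0 observation

        foreach (int key in keys)
            Console.WriteLine("{0} -> bucket {1}", key, key % bucketCount);
        // 38784, 398 and 38294394 all land in bucket 7;
        // 109574 and 98405 both land in bucket 9.
    }
}
```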
Is Dictionary(int) the constructor to set the number of buckets initially? Maybe I'm better off writing my own hashtable class.
Mike
@Mike: No, it only sets its capacity. That somehow affects how many actual buckets there are, but that's an implementation detail that you should not try to tamper with. The dictionary reduces the number of buckets to keep the memory footprint smaller, which makes it more likely that the dictionary fits in the cache, which would make it a lot faster.
Guffa
Well, the good news is I got my code from running in about 8 sec down to about 2.1 sec. The bottleneck is still in hashmap lookups. Also, adding values to the hash table accounts for 14% of the function time, which leads me to believe it's running into collisions and appending to a linked list, or "adjusting" itself. I think I might write my own hashtable class just for kicks.
Mike
A: 

It sounds like you're saying that your dictionary will only have 10 items in it. If so, a hash table may be unwarranted. You can just store your data in a list/array and either iterate over it or use a binary search to find your keys (try both to see what's faster).

If you use a binary search, your list will have to be sorted; if you just iterate over your list and there are some keys that are looked-up more frequently than others, you can put them at the beginning of the list to speed things up.
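A minimal sketch of the sorted-array approach, using the question's sample keys and a string stand-in for MyData:

```csharp
using System;

class BinarySearchDemo
{
    static void Main()
    {
        int[] keys = { 38784, 19294, 109574, 2450985, 5,
                       398, 98405, 12093, 909802, 38294394 };
        string[] values = new string[keys.Length]; // stand-in for MyData
        for (int i = 0; i < keys.Length; i++)
            values[i] = "item" + keys[i];

        // Sort keys and values together so the parallel arrays stay aligned.
        Array.Sort(keys, values);

        int index = Array.BinarySearch(keys, 98405); // O(log n)
        if (index >= 0)
            Console.WriteLine(values[index]); // prints item98405
    }
}
```

Array.BinarySearch returns a negative number when the key is absent, so the `index >= 0` check doubles as the ContainsKey test.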

On the other hand, if your keys are known in advance, you can write your own implementation of a hash table with a fast and perfect hash function (i.e. no collisions), and that should be unbeatable.

Gabe
How do you figure? Shouldn't a hash table be O(1)? That sounds better than iterating over ~10 items. Doing a binary search to find the keys would mean sorting my data structure, which is possible to do, but a bit of work. In theory, a hash table should still be quicker. If constructing a dictionary was slow, I'd have seen that while profiling.
Mike
You only have 10 keys, right? That means your lookup will be O(10), which is the same as O(1). In other words, your computer may be able to iterate a list of 10 items faster than it can call the hash function and perform the integer division required to index into the hash table.
Gabe
Oh yeah, I gotcha now. Good point, there's probably a minimum number of keys in a dictionary before hashing really pays off. I will definitely run some tests to confirm this theory.
Mike
Just tried it, and even with 10 values in the dictionary, computing the hash is much much MUUUUUCH faster than an O(10) lookup: 1786ms vs. 437ms.
Mike
Oh also, my speeds are about double with a Release build. The JIT'er must be doing some nifty inlining or something.
Mike
I would expect that the hash table would be faster, but it depends on your hash function. In any event, it's hard to believe that twice as fast as 437ms wouldn't be fast enough for you.
Gabe
Oh man, I'll be ecstatic if I get my real code to run in 437ms. I'm just doing tests here, and my loops simulate a typical scenario my actual code would run into. However, 437ms is with only one hash lookup per iteration (I'll probably have to do at least 2 or 3). It's also using ints as keys, which I'd have to refactor a whole bunch of code to do. Also, the 437ms is on a ship build; it's about 700ish on debug.
Mike
A: 

The insight into the inner workings of the hash table is spot on. You should definitely be using TryGetValue as your entire inner loop:

  map.TryGetValue(keys[j], out item);

Doing ContainsKey followed by the indexer does the hard part (the lookup) twice. The extra if and the extra keys[j] access are minor, but they add up in a tight loop. Using a foreach over your keys will probably be slower, but depending on the actual contents of the loop, it might be worth profiling.
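As a self-contained sketch of the rewritten loop (MyData here is a minimal stand-in for the question's type, and the dictionary contents are made up for the example):

```csharp
using System;
using System.Collections.Generic;

class MyData { public int Value; }

class TryGetValueDemo
{
    static void Main()
    {
        int[] keys = { 38784, 19294, 5 };
        var map = new Dictionary<int, MyData>();
        map[38784] = new MyData { Value = 1 };
        map[5] = new MyData { Value = 2 };

        MyData item;
        int found = 0;
        for (int j = 0; j < keys.Length; j++)
        {
            // One hash lookup instead of ContainsKey + indexer (two lookups).
            if (map.TryGetValue(keys[j], out item))
                found++;
        }
        Console.WriteLine(found); // prints 2 (38784 and 5 are present)
    }
}
```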

Dolphin
A: 

Adding to the comments about creating your own implementation based on knowledge of the data, here is an example that has no collisions. It may throw an OutOfMemoryException depending on the size of the objects; I tried using an int indexer, but that would throw an OutOfMemoryException. If null is returned, the item doesn't exist.

I haven't profiled this but I would expect minor speed improvements, but larger memory use.

public class QuickLookup<T> where T : class
{
    // One slot per possible short key: 0..32767 and -1..-32768.
    private T[] _positives = new T[short.MaxValue + 1];
    private T[] _negatives = new T[short.MaxValue + 1];
    public T this[short key]
    {
        get
        {
            return key < 0 ? _negatives[(-key) - 1] : _positives[key];
        }
        set
        {
            // Use the same index mapping as the getter: key -1 maps to
            // slot 0, which also keeps key -32768 in range (slot 32767).
            if (key < 0)
                _negatives[(-key) - 1] = value;
            else
                _positives[key] = value;
        }
    }
}
Tim Carter