tags:

views:

266

answers:

3

I have a problem where I need a .NET dictionary that supports multiple items per key. In the past I've used the STL multimap in my C++ programs. How does the design of a multimap differ from a dictionary of lists i.e. performance, size, etc. (excluding generics vs. templates)?

A: 

Define some kind of key class and override it's Equals and GetHashCode methods. Then you can use it as a key in a dictionary:

class MyKey  : IEquatable<MyKey>
{
    public int x;
    public string y;


    public bool Equals(MyKey other)
    {
        return other != null && x == other.x && y == other.y;
    }

    public override int GetHashCode()
    {
        return x.GetHashCode() ^ y.GetHashCode();  // for example
    }
}


Dictionary<MyKey, Foo>  dict;
yodaj007
This doesn't answer the question.
Jason
+1  A: 

multimap: insert/remove operations take logarithmic time Big-O(log n), lookup take constant time Big-O(1).

Each element is accessed using a key value, unlike the map, a multimap key value needs not be unique, associated values do not need to be unique. The map orders the elements by their keys using a stored function key_compare, which simply does a less-than comparison.

Here's an article on IDictionary performance which doesn't mention any Big-O performance metrics but does give some practical runs using the dictionary.

Andy
Of course the documentation for `IDictionary` does not mention any asymptotic guidelines; an interface can not require all implementers to meet certain asymptotic performance requirements. Note further from the documentation for `Dictionary<TKey, TValue>` that "[i]f `Count` is less than the capacity, [`Add`] approaches an `O(1)` operation. If the capacity must be increased to accommodate the new element, this method becomes an `O(n)` operation, where `n` is `Count`." Further, "Retrieving a value by using its key is very fast, close to `O(1)`." Why they are vague on that last point eludes me.
Jason
It's probably because GetHashCode() is not always implemented correctly, and when there are lots of collisions it's no longer O(1).
Dan Berindei
+2  A: 

multimap.count: O(log n + m) where n is number of keys and m is number of items associated with a given key.

For a Dictionary<TKey, List<TValue>> the equivalent functionality would be:

int count = dictionary[key].Count;

And safer is to say

int count;
List<TValue> list;
if(dictionary.TryGetValue(key, out list)) {
    int count = list.Count;
}

This is an O(1) operation because lookup is O(1)1 and List<T>.Count is O(1).

multimap.find: O(log n) where n is number of keys

For a Dictionary<TKey, List<TValue>> the equivalent functionality would be:

List<TValue> elements = dictionary[key];

And safer is to say

List<TValue> list;
if(dictionary.TryGetValue(key, out list)) {
    // safe to iterate list
}

This is O(1). See the previous remark on lookup by key in a Dictionary<TKey, TValue>.

multimap.insert: O(log n) where n is the number of keys.

For a Dictionary<TKey, List<TValue>> the equivalent functionality would be:

// value is TValue to insert
List<TValue> list;
if(!dictionary.TryGetValue(key, out list)) {
    list = new List<TValue>();
    dictionary.Add(key, list);
}
list.Add(value);

This is usually O(1) but can be O(n) when the capacity of the dictionary must be increased to accomodate the new element.

multimap.remove: There are three overloads of this method; I will only consider the one that accepts a key and removes all occurrences of that key from the multimap. This is an O(log n + m) operation where there n keys and m objects associate with a given key.

For a Dictionary<TKey, List<TValue>> the equivalent functionality would be:

 dictionary.Remove(key);

From the documentation: "This method approaches an O(1) operation." Same comment applies.

1: From the documentation: "Retrieving a value by using its key is very fast, close to O(1)." Why the documentation is vague on this point is confusing to me. Either an operation is O(1) or it isn't. There is no such thing as "close" to O(1).

Jason