I'm looking for the ideal data structure (in terms of performance and ease of use) from which values can be retrieved by string key or by index. Dictionary doesn't work because you can't really retrieve by index. Any ideas?
views: 726
answers: 8
My original answer of hash table is incorrect. It cannot have indexes. Move along.
Hash-based collections (Dictionary, Hashtable, HashSet) are out because they have no index. Since you want an index, I'd use a nested generic:
List<KeyValuePair<K,V>>
Of course, you lose the O(1) Key lookup that you get with hashes.
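To make the trade-off concrete, here is a minimal sketch of both kinds of access on such a list. The helper name IndexOfKey and the sample data are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

static class PairListDemo
{
    // Linear search for a key; returns -1 when the key is absent. O(n).
    public static int IndexOfKey<V>(List<KeyValuePair<string, V>> list, string key)
    {
        return list.FindIndex(pair => pair.Key == key);
    }

    public static void Main()
    {
        var list = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("abc", 1),
            new KeyValuePair<string, int>("def", 2),
        };

        Console.WriteLine(list[1].Value);                        // by index, O(1): prints 2
        Console.WriteLine(list[IndexOfKey(list, "abc")].Value);  // by key, O(n): prints 1
    }
}
```

Note that nothing here prevents duplicate keys; FindIndex simply returns the first match.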
There's System.Collections.ObjectModel.KeyedCollection<string, TItem>, which derives from Collection<TItem>. Retrieval is O(1).
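using System.Collections.Generic;
using System.Collections.ObjectModel;

class IndexableDictionary<TItem> : KeyedCollection<string, TItem>
{
    // Maps each item back to its key; this assumes items are unique and non-null.
    private readonly Dictionary<TItem, string> keys = new Dictionary<TItem, string>();

    protected override string GetKeyForItem(TItem item) { return keys[item]; }

    public void Add(string key, TItem item)
    {
        keys[item] = key; // record the key first: Add(item) calls GetKeyForItem
        Add(item);
    }
}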
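For comparison, KeyedCollection is most commonly used when the key can be read off the item itself, so no side dictionary is needed. A minimal sketch, assuming a hypothetical Person type whose Name serves as the key:

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type; the key (Name) is stored on the item itself.
class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

class PersonCollection : KeyedCollection<string, Person>
{
    protected override string GetKeyForItem(Person item) { return item.Name; }
}

static class KeyedCollectionDemo
{
    public static void Main()
    {
        var people = new PersonCollection
        {
            new Person { Name = "Ada", Age = 36 },
            new Person { Name = "Linus", Age = 28 },
        };

        Console.WriteLine(people[1].Name);     // by index: prints Linus
        Console.WriteLine(people["Ada"].Age);  // by key: prints 36
    }
}
```

This only fits if the values carry their own key; for plain values such as int, a wrapper like the one above is still required.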
You are looking for something like the SortedList class (there is a generic version, SortedList<TKey, TValue>, as well).
You want the OrderedDictionary class. You will need to include the System.Collections.Specialized namespace:
OrderedDictionary od = new OrderedDictionary();
od.Add("abc", 1);
od.Add("def", 2);
od.Add("ghi", 3);
od.Add("jkl", 4);
// Can access via index or key value:
Console.WriteLine(od[1]);
Console.WriteLine(od["def"]);
One word of warning. The OrderedDictionary has really bad performance characteristics for most operations except insertion and lookup: Both removal and modification of a value may require a linear search of the whole list, resulting in runtime O(n). (For modification, this depends on whether access occurred by index or by key.)
For most operations with reasonable amounts of data, this is completely unacceptable. Furthermore, the data structure stores elements both in a linear vector and in a hash table, resulting in some memory overhead.
If retrieval by index doesn't happen too often, a SortedList or SortedDictionary will have much better performance characteristics (access by index can be achieved through the ElementAt extension method).
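A short sketch of what index access looks like on a sorted collection. One caveat worth keeping in mind: both types order entries by key, so "index" here means position in key order, not insertion order. The sample keys are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortedListDemo
{
    public static void Main()
    {
        var sorted = new SortedList<string, int>();
        sorted.Add("def", 2);
        sorted.Add("abc", 1);   // inserted second, but sorts first

        Console.WriteLine(sorted["def"]);            // by key: prints 2
        Console.WriteLine(sorted.Values[0]);         // by position in key order: prints 1
        Console.WriteLine(sorted.ElementAt(1).Key);  // LINQ, also works on SortedDictionary: prints def
    }
}
```

The generic SortedList exposes indexable Keys and Values lists, so it does not need ElementAt at all. SortedDictionary has no such lists, and ElementAt has to walk the collection from the start.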
If, on the other hand, access by index is the norm, then stop using dictionary data structures altogether and simply store your values in a List<KeyValuePair<TKey, TValue>>. Although this means a linear search for access by key, all other operations are very cheap and overall performance is hard to beat in practice.
/EDIT: Of course, the latter is also a dictionary data structure in the theoretical sense. You could even encapsulate it in a class implementing the appropriate interface.
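As a rough sketch of that encapsulation, the class below wraps the list and exposes one indexer per access style. The name PairList is hypothetical, and the class deliberately stops short of implementing the full IDictionary<TKey, TValue> interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper; not a full IDictionary<TKey, TValue> implementation.
class PairList<TValue>
{
    private readonly List<KeyValuePair<string, TValue>> items =
        new List<KeyValuePair<string, TValue>>();

    public int Count { get { return items.Count; } }

    public void Add(string key, TValue value)
    {
        if (IndexOfKey(key) >= 0)
            throw new ArgumentException("Duplicate key: " + key);
        items.Add(new KeyValuePair<string, TValue>(key, value));
    }

    // Access by position: O(1).
    public TValue this[int index]
    {
        get { return items[index].Value; }
    }

    // Access by key: O(n) linear search.
    public TValue this[string key]
    {
        get
        {
            int i = IndexOfKey(key);
            if (i < 0) throw new KeyNotFoundException(key);
            return items[i].Value;
        }
    }

    private int IndexOfKey(string key)
    {
        return items.FindIndex(pair => pair.Key == key);
    }
}
```

With this in place, list[1] and list["abc"] both work on the same instance, and the linear key search stays hidden behind the indexer.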
A Dictionary could work with LINQ, although I don't know about possible performance issues: Dictionary.ElementAt(index);
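A minimal sketch of that call, with sample data assumed for illustration. Two things to be aware of: ElementAt has to enumerate the dictionary up to the requested position, so it is O(n), and the documentation describes the enumeration order of Dictionary as undefined, so the element at a given index is not guaranteed to match insertion order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ElementAtDemo
{
    public static void Main()
    {
        var dict = new Dictionary<string, int> { { "abc", 1 }, { "def", 2 } };

        // Walks the dictionary's enumerator until it reaches position 1.
        KeyValuePair<string, int> entry = dict.ElementAt(1);
        Console.WriteLine(entry.Key + " = " + entry.Value);
    }
}
```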
I recommend using SortedDictionary<string, TValue> or SortedList<string, TValue>. Both have O(log n) search performance.
The differences are, as quoted from the MSDN library:
SortedList<TKey, TValue> uses less memory than SortedDictionary<TKey, TValue>.
SortedDictionary<TKey, TValue> has faster insertion and removal operations for unsorted data: O(log n) as opposed to O(n) for SortedList<TKey, TValue>.
If the list is populated all at once from sorted data, SortedList<TKey, TValue> is faster than SortedDictionary<TKey, TValue>.
In my experience, SortedDictionary is better suited to most typical business scenarios, since the data is usually unsorted to begin with when using structures like this, and the memory overhead of SortedDictionary is seldom critical. But if performance is key for you, I suggest you implement both and do measurements.
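One rough way to do such a measurement is sketched below with Stopwatch. The key count, the seed and the helper name TimeInserts are assumptions; a serious benchmark would also warm up, repeat the runs, and use data shaped like your real workload.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class InsertBenchmark
{
    // Times inserting the given keys into any IDictionary implementation.
    public static TimeSpan TimeInserts(IDictionary<string, int> target, IList<string> keys)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < keys.Count; i++)
            target[keys[i]] = i;   // indexer set, so repeated keys do not throw
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Assumption: 20,000 pseudo-random keys stand in for "unsorted data".
        var random = new Random(42);
        var keys = new List<string>();
        for (int i = 0; i < 20000; i++)
            keys.Add(random.Next().ToString());

        Console.WriteLine("SortedList:       " + TimeInserts(new SortedList<string, int>(), keys));
        Console.WriteLine("SortedDictionary: " + TimeInserts(new SortedDictionary<string, int>(), keys));
    }
}
```

Lookups and removals can be timed the same way by swapping the loop body.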