.NET has a lot of complex data structures. Unfortunately, some of them are quite similar and I'm not always sure when to use one and when to use another. Most of my C# and VB books talk about them to a certain extent, but never really go into any real detail.

What's the difference between Array, ArrayList, List, Hashtable, Dictionary, SortedList, and SortedDictionary?

Which ones are enumerable (IList -- can do 'foreach' loops)? Which ones use key/value pairs (IDictionary)?

What about memory footprint? Insertion speed? Retrieval speed?

Are there any other data structures worth mentioning?

EDIT: I'm still searching for more details on memory usage and speed (Big-O notation)

+7  A: 

If at all possible, use generics. This includes:

  • List instead of ArrayList
  • Dictionary instead of Hashtable
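
As a rough illustration of why the generic types are preferred, here is a minimal sketch (the variable names are made up for the example). It assumes a compiler that supports top-level statements (C# 9 or later); on older compilers the same lines go inside a Main method.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Non-generic ArrayList stores everything as object, so mixed types compile fine.
var untyped = new ArrayList();
untyped.Add(1);
untyped.Add("two");                 // compiles; the mistake only shows up at runtime

bool castFailed = false;
try
{
    Console.WriteLine((int)untyped[1]);   // throws InvalidCastException
}
catch (InvalidCastException)
{
    castFailed = true;
}

// Generic List<int> rejects the wrong type at compile time and needs no casts.
var typed = new List<int>();
typed.Add(1);
typed.Add(2);
// typed.Add("two");                // would not compile
int sum = typed[0] + typed[1];

Console.WriteLine($"castFailed={castFailed}, sum={sum}");   // castFailed=True, sum=3
```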
Adam Tegen
+1  A: 

Actually, I think the MSDN help provides pretty good answers to all these questions. Just look up .NET collections.

Scott
+1  A: 

They're spelled out pretty well in IntelliSense. Just type System.Collections. or System.Collections.Generic. (preferred) and you'll get a list and short description of what's available.

Joel Coehoorn
+8  A: 

First, all collections in .NET implement IEnumerable.

Second, a lot of the collections are duplicates because generics were added in version 2.0 of the framework.

So, although the generic collections likely add features, for the most part:

  • List is a generic implementation of ArrayList.
  • Dictionary is a generic implementation of Hashtable.

Arrays are fixed-size collections in which you can change the value stored at a given index.
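
A small sketch of what "fixed size" means in practice (variable names are illustrative only; assumes C# 9 top-level statements). Array.Resize does not grow the array in place; it allocates a new one and copies the elements across.

```csharp
using System;

int[] numbers = { 10, 20, 30 };
numbers[1] = 25;                        // changing a stored value is fine

int[] original = numbers;               // keep a reference to the first array
Array.Resize(ref numbers, 5);           // allocates a NEW array and copies the items

bool sameObject = object.ReferenceEquals(original, numbers);
Console.WriteLine($"original.Length={original.Length}, numbers.Length={numbers.Length}, sameObject={sameObject}");
// original.Length=3, numbers.Length=5, sameObject=False
```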

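SortedDictionary and SortedList are both IDictionary implementations that are sorted based on the keys, using either the keys' default ordering or an optional IComparer. They differ in implementation: SortedList is backed by sorted arrays (less memory and access by index, but O(n) insertion and removal), while SortedDictionary is a binary search tree (O(log n) insertion and removal).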

So, the IDictionary implementations (those supporting KeyValuePairs) are:

  • Hashtable
  • Dictionary
  • SortedList
  • SortedDictionary
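
To make the sorted variants concrete, here is a minimal sketch (the Fill helper and the sample keys are made up for illustration; assumes C# 9 top-level statements). Both sorted types enumerate in key order, and StringComparer.Ordinal is passed only to show where an IComparer can be supplied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: fills any IDictionary<string, int> in the same unsorted order.
static void Fill(IDictionary<string, int> d)
{
    d["pear"] = 3;
    d["apple"] = 1;
    d["fig"] = 2;
}

var sortedList = new SortedList<string, int>(StringComparer.Ordinal);
var sortedDict = new SortedDictionary<string, int>(StringComparer.Ordinal);
Fill(sortedList);
Fill(sortedDict);

string listOrder = string.Join(",", sortedList.Keys);
string dictOrder = string.Join(",", sortedDict.Keys);
Console.WriteLine(listOrder);   // apple,fig,pear
Console.WriteLine(dictOrder);   // apple,fig,pear

// SortedList also allows access by position, because it is array-backed.
string firstKey = sortedList.Keys[0];
int pearIndex = sortedList.IndexOfKey("pear");
Console.WriteLine($"{firstKey} {pearIndex}");   // apple 2
```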

Another collection that was added in .NET 3.5 is the HashSet. It is a collection of unique items that supports set operations such as union and intersection.
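
A short sketch of those set operations (sample values are arbitrary; assumes C# 9 top-level statements). Note that UnionWith, IntersectWith and ExceptWith modify the set they are called on, so copies are made first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var evens = new HashSet<int> { 2, 4, 6, 8 };
var small = new HashSet<int> { 1, 2, 3, 4 };

bool addedAgain = evens.Add(2);          // false: 2 is already in the set

var union = new HashSet<int>(evens);
union.UnionWith(small);                  // items in either set

var both = new HashSet<int>(evens);
both.IntersectWith(small);               // items in both sets

var onlyEvens = new HashSet<int>(evens);
onlyEvens.ExceptWith(small);             // items in evens but not in small

// HashSet does not promise an enumeration order, so sort before printing.
Console.WriteLine(string.Join(",", union.OrderBy(x => x)));      // 1,2,3,4,6,8
Console.WriteLine(string.Join(",", both.OrderBy(x => x)));       // 2,4
Console.WriteLine(string.Join(",", onlyEvens.OrderBy(x => x)));  // 6,8
```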

Also, the LinkedList is a standard linked-list implementation (the List is an array-list for faster retrieval).
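
The trade-off between the two can be sketched like this (names are illustrative; assumes C# 9 top-level statements): LinkedList inserts cheaply next to a node you already hold, while List gives direct access by index but has to shift elements on insert.

```csharp
using System;
using System.Collections.Generic;

var linked = new LinkedList<string>();
LinkedListNode<string> nodeB = linked.AddLast("b");
linked.AddFirst("a");                   // O(1) insert at the head
linked.AddAfter(nodeB, "c");            // O(1) insert next to a known node
string linkedOrder = string.Join(",", linked);
Console.WriteLine(linkedOrder);         // a,b,c

var list = new List<string> { "a", "b", "c" };
string second = list[1];                // O(1) access by index (LinkedList has no indexer)
list.Insert(0, "start");                // O(n): every later element shifts by one
Console.WriteLine($"{second} {list[0]} {list.Count}");   // b start 4
```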

Abe Heidebrecht
+21  A: 

Off the top of my head:

Array - represents an old-school memory array; System.Array is the base class behind every normal type[] array. Can enumerate. Can't grow automatically. Very fast to set and get an element by index (O(1)).

ArrayList - automatically growing array. Adds more overhead. Can enumerate. Probably slower than a normal array but still pretty fast. These are used a lot in .NET.

List - one of my favs - can be used with generics, so you can have a strongly typed array, e.g. List<string>. Other than that, acts very much like ArrayList.

Hashtable - plain old hashtable. O(1) on average, O(n) in the worst case. Can enumerate the Keys and Values properties, and do key/value pairs.

Dictionary - same as above only strongly typed via generics, such as Dictionary<string, string>

SortedList - a key/value collection kept sorted by key (it comes in both non-generic and generic forms). Slower on insertion since it has to figure out where to put things. Can enumerate. Retrieval stays fast since it never has to re-sort (lookup by key is a binary search), but insertion and deletion have to shift elements to keep the order.

I tend to use List and Dictionary all the time - once you start using them strongly typed with generics, it's really hard to go back to the standard non-generic ones.

There are lots of other data structures too - there's KeyValuePair which you can use to do some interesting things, there's a SortedDictionary which can be useful as well.
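
For the Hashtable and Dictionary entries above, here is a minimal sketch of how the two differ in everyday use (keys and values are made up; assumes C# 9 top-level statements). The main points are the cast on the way out of a Hashtable and the different behaviour for a missing key.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

var table = new Hashtable();
table["one"] = 1;                               // stored as object
int fromTable = (int)table["one"];              // cast needed on the way out
bool missingIsNull = table["nope"] == null;     // Hashtable returns null for a missing key

var dict = new Dictionary<string, int>();
dict["one"] = 1;
int fromDict = dict["one"];                     // no cast
bool found = dict.TryGetValue("nope", out int missingValue);   // false, missingValue is 0

bool threw = false;
try { _ = dict["nope"]; }                       // Dictionary's indexer throws instead
catch (KeyNotFoundException) { threw = true; }

// Enumerating a Dictionary yields strongly typed KeyValuePair items.
foreach (KeyValuePair<string, int> pair in dict)
    Console.WriteLine($"{pair.Key} = {pair.Value}");   // one = 1
```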

Sam Schutte
Hash Table is O(1), worst case (with collisions) can be O(n)
Justin Bozonier
+6  A: 

Here are a few general tips for you:

  • You can use foreach on types that implement IEnumerable. IList is essentially an IEnumerable with Count and Item (accessing items using a zero-based index) properties. IDictionary, on the other hand, means you can access items by key rather than by position.

  • Array, ArrayList, and List all implement IList. Dictionary, SortedDictionary, SortedList, and Hashtable implement IDictionary.

  • If you are using .NET 2.0 or higher, it is recommended that you use generic counterparts of mentioned types.

  • For time and space complexity of various operations on these types, you should consult their documentation.

  • The .NET data structures are in the System.Collections namespace, with the generic versions in System.Collections.Generic. There are type libraries such as PowerCollections which offer additional data structures.

  • To get a thorough understanding of data structures, consult resources such as CLRS.
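
The interface tips above can be checked directly rather than taken on trust. This is only a sketch (the samples array is made up for the example; assumes C# 9 top-level statements) that asks each collection which of the non-generic interfaces it implements.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

object[] samples =
{
    new int[3],                          // 0
    new ArrayList(),                     // 1
    new List<int>(),                     // 2
    new Hashtable(),                     // 3
    new Dictionary<string, int>(),       // 4
    new SortedList(),                    // 5
    new SortedList<string, int>(),       // 6
    new SortedDictionary<string, int>()  // 7
};

// Print one row per type: can it be used with foreach, by index, by key?
foreach (object c in samples)
{
    Console.WriteLine(
        $"{c.GetType().Name,-20} IEnumerable={c is IEnumerable} IList={c is IList} IDictionary={c is IDictionary}");
}
```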

blackwing
A: 

Hashtables/Dictionaries are O(1) performance, meaning that performance is not a function of size. That's important to know.

EDIT: In practice, the average time complexity for Hashtable/Dictionary<> lookups is O(1).

Christopher
There is no such thing as "performance". The complexity depends on the operation. For example, if you insert n elements into Dictionary<>, not every insert will be O(1), due to rehashing.
Ilya Ryzhenkov
A: 

There are subtle and not-so-subtle differences between generic and non-generic collections; they are not merely typed versions of the same thing, and can use different underlying data structures. For example, Hashtable guarantees one-writer-many-readers without sync. Dictionary does not.
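
Once there is more than one writer, both types need some form of synchronization. The following is only a sketch of two common options (the names N and gate are made up; assumes C# 9 top-level statements): the Hashtable.Synchronized wrapper for Hashtable, and a plain lock around a Dictionary.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

const int N = 1000;

// Hashtable.Synchronized returns a thread-safe wrapper around the table.
Hashtable table = Hashtable.Synchronized(new Hashtable());
Parallel.For(0, N, i => table[i] = i);

// Dictionary has no such wrapper, so every access is guarded with a lock here.
var dict = new Dictionary<int, int>();
var gate = new object();
Parallel.For(0, N, i =>
{
    lock (gate) { dict[i] = i; }
});

Console.WriteLine($"{table.Count} {dict.Count}");   // 1000 1000
```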

Ilya Ryzhenkov
+1  A: 

The generic collections will perform better than their non-generic counterparts when they hold value types, especially when iterating through many items. This is because boxing and unboxing no longer occur.
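
Boxing can be made visible with a small sketch (variable names are illustrative; assumes C# 9 top-level statements). Each time an int goes into an ArrayList it is wrapped in a new heap object, which is why adding the same value twice produces two distinct references.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

int n = 42;

var boxedList = new ArrayList();
boxedList.Add(n);                       // boxes n into a new heap object
boxedList.Add(n);                       // boxes again: a second, separate object
bool sameBox = object.ReferenceEquals(boxedList[0], boxedList[1]);

var typedList = new List<int> { n, n }; // stored as plain ints, no boxing

int total = 0;
foreach (int v in typedList) total += v;            // no unboxing needed
foreach (object o in boxedList) total += (int)o;    // explicit unboxing per item

Console.WriteLine($"sameBox={sameBox}, total={total}");   // sameBox=False, total=168
```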

Russ Cam