How does the performance of reading/adding values from/to a Dictionary(Of String, SomeReferenceType) depend on the number of records already in it? That is, as n grows large, does the time per operation behave like O(1), O(log n), O(n), or something else?

Dim index As New Dictionary(Of String, SomeReferenceType)

' N entries added in a loop
' ...

Dim id As Integer = 123456789 ' 9-digit number
Dim key As String = id.ToString()
Dim value As New SomeReferenceType()

index(key) = value                                   ' Need to estimate this...
Dim found As Boolean = index.TryGetValue(key, value) ' ...and this, as functions of N (for large N)

Also, what happens if memory runs short? Should we set the dictionary's capacity before adding elements, to avoid copying it when it has to grow? And how long does that operation (copying the dictionary to a new location when needed) take as a function of N?

+2  A: 

Dictionary is implemented with hash tables - Thus O(1).

SortedDictionary is implemented with binary trees (red/black) - Thus O(log n)

SortedList is implemented with a sorted array + binary search - Thus O(log n) lookup, but O(n) insertion (elements have to be shifted).

All of these collections increase their size automatically (a resize is O(n), but it happens rarely because the capacity grows geometrically, so insertion is amortized O(1) anyway).
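If you want to see this in practice, here is a minimal, unscientific sketch (the element count and key choice are arbitrary) comparing lookups in the hash-based Dictionary with the tree-based SortedDictionary:

Imports System.Diagnostics

Module LookupDemo
    Sub Main()
        Const N As Integer = 1000000
        Dim hashed As New Dictionary(Of String, Integer)(N)
        Dim tree As New SortedDictionary(Of String, Integer)

        ' Fill both collections with the same keys
        For i As Integer = 0 To N - 1
            Dim k As String = i.ToString()
            hashed(k) = i
            tree(k) = i
        Next

        ' The hash table should stay flat as N grows;
        ' the red/black tree should grow logarithmically
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 0 To N - 1
            Dim dummy As Integer = hashed(i.ToString())
        Next
        Console.WriteLine("Dictionary (hash):       " & sw.ElapsedMilliseconds & " ms")

        sw = Stopwatch.StartNew()
        For i As Integer = 0 To N - 1
            Dim dummy As Integer = tree(i.ToString())
        Next
        Console.WriteLine("SortedDictionary (tree): " & sw.ElapsedMilliseconds & " ms")
    End Sub
End Module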

Just Google them and look in the Microsoft documentation.

Dario
"Dictionary is implemented with Hastables - Thus O(1)" - in fact, it cannot be constant, it should be increasingThe size doubles every time when needed, so there will be copying several times. It's interesting how much copying takes. As I've heard, in .NET it's pretty fast...
Roma
To be explicit on one point: there is no easy way to set the maximum capacity of a Dictionary. The capacity parameter sets only the initial capacity; the dictionary is free to grow from there.
Vic E
Of course it will copy. That's no contradiction - it doesn't copy on every insertion, only at sizes that follow a geometric progression.
Dario
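You can watch that geometric growth directly with List(Of T), which exposes its Capacity (Dictionary grows similarly, but with prime-sized tables; the loop bound here is arbitrary):

Dim items As New List(Of Integer)
Dim lastCapacity As Integer = items.Capacity
For i As Integer = 0 To 999
    items.Add(i)
    If items.Capacity <> lastCapacity Then
        ' Prints 4, 8, 16, 32, ... - each reallocation copies every
        ' element, but doubling keeps the amortized cost of Add at O(1)
        Console.WriteLine("Count=" & items.Count & " Capacity=" & items.Capacity)
        lastCapacity = items.Capacity
    End If
Next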
+2  A: 

The performance of getting an item from a Dictionary is affected very little by the number of items. The items are divided into buckets according to their hash code, so there is generally only one item, or very few items, in each bucket. The operation is close to an O(1) operation.
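Conceptually, the bucket selection looks like this (a simplified sketch; the real Dictionary(Of TKey, TValue) internals differ in detail, but the idea is the same):

Dim bucketCount As Integer = 7919 ' a prime, as the real implementation uses
Dim key As String = "123456789"

' Mask off the sign bit, then map the hash code onto a bucket index
Dim bucket As Integer = (key.GetHashCode() And &H7FFFFFFF) Mod bucketCount
Console.WriteLine("Key " & key & " lands in bucket " & bucket)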

Adding items is also close to an O(1) operation. If the capacity has to be increased you take a performance hit, but on average it is very small. As the capacity doubles each time, the amount of data moved when increasing the capacity is really not that much. Each item is on average moved an extra 1.3 times, so the average extra work per add comes down to moving about 16 bytes.

If you know how large the Dictionary will get, or have a decent estimate, you should pass that to the constructor when you create it, to reduce or eliminate the need to increase the capacity.
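For example, with an expected size of ten million entries (a number picked just for illustration):

Const expectedCount As Integer = 10000000 ' rough estimate of the final N

' Pre-allocating the internal tables means no resizing/rehashing
' happens until the count exceeds the initial capacity
Dim index As New Dictionary(Of String, SomeReferenceType)(expectedCount)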

Guffa
Thanks a lot, that's what I was asking for. How did you get this: "Each item is on average moved an extra 1.3 times, so the average extra work per add comes down to moving about 16 bytes"? Also, as I've heard, there's some issue with capacity - something like the capacity should be a prime number...
Roma
Well, the exact calculation was done for a List, but a Dictionary grows in a similar way. The capacity of a List doubles, so it grows to 32, 64, 128, 256 and so on. When it has grown to a capacity of N, it has copied a total of about N items, and since on average half of the last allocated block is in use, N is about 1.3 times the current item count. A Dictionary<object,object> on a 32-bit system uses 12 bytes per item in its internal arrays, which times 1.3 comes to 15.6 bytes.
Guffa
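The arithmetic, as a rough sketch (the growth limit of one million is arbitrary):

' Growing by doubling from 32: copies = 32 + 64 + ... + capacity/2 ~ capacity
Dim capacity As Long = 32
Dim totalCopied As Long = 0
While capacity < 1000000
    totalCopied += capacity ' every item is copied into the new, larger array
    capacity *= 2
End While

' On average the collection is filled to about 3/4 of its final capacity
Dim count As Double = 0.75 * capacity
Console.WriteLine("Extra moves per item: " & (totalCopied / count)) ' ~1.3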
The capacity of a Dictionary is a prime number, but that is nothing you need to think about; the Dictionary takes care of it by itself.
Guffa