views:

142

answers:

4

I've been coding in C++ and Java my entire life, but C# feels like a totally different animal.

In case of a hash collision in the Dictionary container in C#, what does it do? Does it even detect the collision?

When collisions occur in similar containers in the STL, some implementations chain the colliding key/value entries together like a linked list, while others try a different hash method.

[Update 10:56 A.M. 6/4/2010]

I am trying to keep a counter per user. The number of users is not fixed; it can both increase and decrease. I'm expecting the data size to be over 1000.

So, I want :

  • Fast access, preferably not O(n). It is important that I get close to O(1): I need to make sure I can force people to log off before they are able to execute something silly.
  • Dynamic growth and shrinking.
  • Unique data.

A hash map was my solution, and it seems Dictionary is the closest thing to a hash map in C#...
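
A minimal sketch of the per-user counter described above, assuming users are identified by an `int` id. The `UserCounters` class and its member names are hypothetical and only illustrate the requirements listed; the initial capacity of 1024 is just a hint derived from the "over 1000" estimate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: one counter per user id, backed by Dictionary<int, int>.
public class UserCounters
{
    // The capacity argument is only a hint; the dictionary still grows if needed.
    private readonly Dictionary<int, int> counts = new Dictionary<int, int>(1024);

    // Adds one to the user's counter, creating the entry on first use.
    public int Increment(int userId)
    {
        int current;
        counts.TryGetValue(userId, out current); // current is 0 when the user is absent
        counts[userId] = current + 1;
        return current + 1;
    }

    // Returns the user's counter, or 0 if the user has no entry.
    public int Get(int userId)
    {
        int current;
        counts.TryGetValue(userId, out current);
        return current;
    }

    // Drops the user's entry (for example after a forced log-off).
    public bool Remove(int userId)
    {
        return counts.Remove(userId);
    }

    public int UserCount
    {
        get { return counts.Count; }
    }
}
```

Keys are unique by construction, entries can be added and removed at any time, and lookups by user id do not scan the whole collection.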

A: 

I believe it will resize the underlying array to twice its size and then rehash, after which the key will likely land in an open bucket.

Jesse C. Slicer
So it's guaranteed to be protected from collision cases? And is there any way to change the growth factor to something less than 2 when memory is limited?
Ankiov Spetsnaz
Actually, I think that the OP is correct: the hash size is fixed, and a collision converts that bucket into a linked list or a b-tree. But I'm not sure.
JSBangs
Interesting. The `Hashtable` class does it differently than the generic `Dictionary` class.
Jesse C. Slicer
+2  A: 

According to this article at MSDN, in case of a hash collision the Dictionary class converts the bucket into a linked list. The older Hashtable class, on the other hand, uses rehashing.

JSBangs
A: 

Check this link for a good explanation: An Extensive Examination of Data Structures Using C# 2.0

Basically, the .NET generic dictionary chains items with the same hash value.

Groo
+4  A: 

Hash collisions are correctly handled by Dictionary<> - in that so long as an object implements GetHashCode() and Equals() correctly, the appropriate instance will be returned from the dictionary.

First, you shouldn't make any assumptions about how Dictionary<> works internally - that's an implementation detail that is likely to change over time. Having said that....

What you should be concerned with is whether the types you are using for keys implement GetHashCode() and Equals() correctly. The basic rules are that GetHashCode() must return the same value for the lifetime of the object, and that Equals() must return true when two instances represent the same object. Unless you override it, Equals() uses reference equality - which means it only returns true if two objects are actually the same instance. You may override how Equals() works, but then you must ensure that two objects that are 'equal' also produce the same hash code.
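
As an illustration of these rules, here is a minimal sketch of a key type that overrides both methods consistently. The `UserKey` class and its `Domain` and `Name` properties are hypothetical, and the multiplier 397 is just one common way to combine hash codes, not anything Dictionary<> requires.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: two instances with the same Domain and Name are "equal".
public sealed class UserKey : IEquatable<UserKey>
{
    // Read-only after construction, so the hash code cannot change
    // while the object is being used as a dictionary key.
    public string Domain { get; private set; }
    public string Name { get; private set; }

    public UserKey(string domain, string name)
    {
        Domain = domain;
        Name = name;
    }

    public bool Equals(UserKey other)
    {
        return other != null && Domain == other.Domain && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as UserKey);
    }

    // Built from exactly the fields that Equals compares, so equal
    // objects always produce the same hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            int domainHash = Domain == null ? 0 : Domain.GetHashCode();
            int nameHash = Name == null ? 0 : Name.GetHashCode();
            return (domainHash * 397) ^ nameHash;
        }
    }
}
```

With this in place, a value stored under `new UserKey("corp", "alice")` can be found again with a different instance built from the same two strings.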

From a performance standpoint, you may also want to provide an implementation of GetHashCode() that generates a good spread of values to reduce the frequency of hash code collisions. The primary downside of hash code collisions is that they reduce the dictionary to a list in terms of performance. Whenever two different object instances yield the same hash code, they are stored in the same internal bucket of the dictionary. As a result, a linear scan must be performed, calling Equals() on each instance until a match is found.
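
The effect described above can be observed with a deliberately bad key. This is only a sketch: `CollidingKey`, `EqualsCalls` and `CollisionDemo` are hypothetical names, and the constant hash code exists purely to force every key to collide.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key whose hash code is constant, so every key collides.
public sealed class CollidingKey
{
    // Counts how often the dictionary had to fall back to Equals().
    public static int EqualsCalls;

    public readonly int Id;

    public CollidingKey(int id)
    {
        Id = id;
    }

    // Legal (equal objects still share a hash code), but the worst possible spread.
    public override int GetHashCode()
    {
        return 42;
    }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        var other = obj as CollidingKey;
        return other != null && other.Id == Id;
    }
}

public static class CollisionDemo
{
    // Fills a dictionary with n keys that all share one hash code.
    public static Dictionary<CollidingKey, string> Build(int n)
    {
        var map = new Dictionary<CollidingKey, string>();
        for (int i = 0; i < n; i++)
        {
            map[new CollidingKey(i)] = "user" + i;
        }
        return map;
    }
}
```

Lookups such as `CollisionDemo.Build(100)[new CollidingKey(57)]` still return the right value, which is the "correctly handled" part. The cost shows up in `CollidingKey.EqualsCalls`, which grows with the number of colliding entries; the exact count depends on the runtime's implementation, so treat it as an indicator only.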

LBushkin
FWIW, you can use Redgate .NET Reflector to look at the actual implementation, but LBushkin is correct, it's likely to change over time, so don't count on it.
Aren
But do you know whether it would double the hash map's capacity in case of a collision? Because that may be too expensive for me.
Ankiov Spetsnaz
Looking at the code, it looks like the `.Resize()` function is only called when the entire dictionary is full. The current implementation seems to find the NEXT bucket when a collision happens, but this is just my interpretation of reverse-engineered IL, so make of that what you will.
Aren
@Ankiov: What runtime environment are you on, and with what kind of data loads, that you are concerned about this? Are you on the .NET Compact Framework? The implementations may be different from the client version (for instance, on Windows Phone there's a slightly different version of the BCL and CLR).
LBushkin
@LBushkin: It will be on just the normal .NET Framework, not the mobile or compact ones. But what worries me is that the data size can grow significantly. @Aren B: thanks much!
Ankiov Spetsnaz