views:

88

answers:

3

I am confused about the time complexity of hash tables. Many articles state that they are "amortized O(1)", not true O(1) — what does this mean in real applications? What is the average time complexity of the operations of a hash table in an actual implementation, not just in theory, and why are the operations not true O(1)?

A: 

For some uses of hash tables, it's impossible to create them with the "right" size in advance, because it is not known how many elements will need to be held simultaneously during the lifetime of the table. If you want to keep fast access, you need to resize the table from time to time as the number of elements grows. This resizing takes linear time with respect to the number of elements already in the table, and is usually done on an insertion, when the number of elements passes a threshold.

These resizing operations can be made seldom enough that the amortized cost of insertion is still constant (by following a geometric progression for the size of the table, for instance doubling the size each time it is resized). But an individual insertion occasionally takes O(n) time, because it is the one that triggers a resize.
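To make the idea concrete, here is a minimal sketch of such a table (the class and method names are hypothetical, not any particular library's implementation): chaining for collisions, and a doubling of the bucket array whenever a load-factor threshold is crossed, so that the occasional O(n) rebuild keeps insertion amortized O(1).

```python
class SketchHashTable:
    """Illustrative hash table with chaining and geometric resizing."""

    def __init__(self, initial_size=8, max_load=0.75):
        self.buckets = [[] for _ in range(initial_size)]
        self.count = 0
        self.max_load = max_load

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        # Resize (an O(n) rebuild) only when the load factor passes the
        # threshold; doubling keeps these rebuilds geometrically rare.
        if self.count / len(self.buckets) > self.max_load:
            self._resize(2 * len(self.buckets))
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))
        self.count += 1

    def lookup(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self, new_size):
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old:
            for k, v in bucket:
                self.buckets[hash(k) % new_size].append((k, v))
```

Most inserts touch only one bucket; only the rare insert that crosses the threshold pays for rebuilding the whole table.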

In practice, this is not a problem unless you are building hard real-time applications.

Pascal Cuoq
It's not only the size that's the consideration — it's also hash collisions. There are different ways of dealing with them, but whichever you choose, handling a collision does not happen in O(1) time. The average case is still close to O(1) in practice, though, unless the hash table gets quite full.
Jords
@Jords I do not know what "close to O(1)" means. Besides, I am pretty confident that the "amortized O(1)" found in the literature corresponds to hypotheses on the hash function under which the bucket depth remains below a fixed bound, hence constant time. Because if the lookup without resizing were not constant time, the amortized lookup would certainly not be constant time either.
Pascal Cuoq
A: 

It's impossible to know in advance how many collisions your hash function will produce, or when the table will need to resize. This adds an element of unpredictability to the performance of a hash table, making it not true O(1). However, virtually all hash table implementations offer O(1) on the vast, vast, vast majority of inserts. This is similar to inserting into a dynamic array — it's O(1) unless you need to resize, in which case it's O(n), plus the collision uncertainty.
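The dynamic-array analogy can be checked by counting element copies. The sketch below (function name is hypothetical) simulates n appends into an array that doubles its capacity when full: the total resize work stays under 2n copies, so each append is amortized O(1) even though individual appends occasionally cost O(n).

```python
def total_copies(n, initial_capacity=1):
    """Count element copies performed across n appends with doubling."""
    capacity, size, copies = initial_capacity, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size   # one O(size) copy at each resize
            capacity *= 2    # geometric growth keeps resizes rare
        size += 1
    return copies

n = 1_000_000
assert total_copies(n) < 2 * n   # total resize work is linear in n
```

The copies form a geometric series 1 + 2 + 4 + ... bounded by 2n, which is why the per-append average is constant.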

In reality, with a good hash function and a sensible load factor, long collision chains are rare, and the only condition in which you'd need to worry about these details is when your specific code has a very tight time window in which it must run. For virtually every use case, hash tables are O(1). More impressive than O(1) insertion is O(1) lookup.
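To see why the quality of the hash function matters, here is a small sketch (the helper name is made up for illustration) contrasting a well-spreading hash with a degenerate one that sends every key to bucket 0, turning O(1) expected lookups into O(n) scans of a single chain.

```python
def bucket_lengths(keys, num_buckets, hash_fn):
    """Tally how many keys land in each bucket under a given hash."""
    buckets = [0] * num_buckets
    for k in keys:
        buckets[hash_fn(k) % num_buckets] += 1
    return buckets

keys = range(1000)
spread = bucket_lengths(keys, 64, hash)          # good hash: short chains
clash = bucket_lengths(keys, 64, lambda k: 0)    # degenerate: one long chain

print(max(spread))   # longest chain stays near 1000/64
print(max(clash))    # 1000 — a lookup may scan the entire chain
```

With the degenerate hash, every lookup degrades to a linear search, which is exactly the worst case the comments above are arguing about.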

DeadMG
A: 

From Wikipedia:

Amortized Analysis finds the average running time per operation over a worst-case sequence of operations. Amortized analysis differs from average-case performance in that probability is not involved; amortized analysis guarantees the time per operation over worst-case performance.

Consider the worst case, where all the hashes coincide, in a hash table of size m (with m << n). Let the entry that you are looking for be the last one inserted. Then you would have to search through m entries (i.e., the whole table) to reach your desired entry. If you wish to look up another entry, you can be sure that it would take no more than m searches to reach it.

Let us assume that you perform a lookup n times, always looking for the last entry inserted. The worst-case sequence of operations then costs m + m + ... + m (n times), which is O(n) when m is a fixed constant.

Hence, the amortized time per operation is,

  total time complexity       O(n)
__________________________  = ____  = O(1)
total number of operations     n
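Taking the answer's own assumptions at face value (m is a fixed table size, much smaller than n), the arithmetic can be checked numerically; the values below are purely illustrative.

```python
m = 8  # assumed fixed table size, with m << n
for n in (100, 10_000, 1_000_000):
    total = sum(m for _ in range(n))  # worst-case sequence: n full scans of m slots
    assert total / n == m             # amortized cost per lookup is m, independent of n
print("amortized cost per operation:", m)
```

The average stays at m for every n, which is the sense in which the per-operation cost is a constant.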
Kedar Soparkar
I do not follow the "Hence" part of your reasoning. `1+2+...+n` is not `O(n)` but `O(n^2)`, and in the degenerate case where all entries clash to the same bucket, the amortized time is definitely not `O(1)` but `O(n)`.
Pascal Cuoq
@Kedar Soparkar 1+2+...+(n-1)+n is indeed exactly n(n+1)/2, and asymptotically O(n^2). If you think otherwise, I recommend you pick up any good book on elementary arithmetic, or http://www.enotes.com/math/q-and-a/using-mathematical-induction-prove-that-sum-first-205275
Pascal Cuoq
@Pascal Cuoq, sorry for that aberration, I hope you read my corrected answer.
Kedar Soparkar