tags:
views: 320
answers: 2

Does anyone know the answer to this question?

+5  A: 

Yes, still constant time (amortized).

Justice
What does amortized mean?
SuperString
In theory... memory paging could be an issue. Unlikely, but possible.
Matthew Whited
Amortized means that some individual inserts may take a longer time than others, but the average time remains constant.
JSBangs
It means that some accesses will take longer, but if you look at the performance over all queries the running time is O(1). That is, over a sequence of accesses the average access time will be constant: http://en.wikipedia.org/wiki/Amortized_analysis
tvanfosson
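
To make the doubling argument behind "amortized" concrete, here is a small editorial sketch in C++ (not from the thread; it assumes the common grow-by-doubling rehash policy) that counts how many element copies rehashing performs across n inserts:

    #include <cstddef>
    #include <iostream>

    int main() {
        // Assumed policy: the table doubles whenever it fills up, and each
        // rehash copies every existing element once into the larger table.
        const std::size_t n = 100000000;  // 100 million inserts
        std::size_t capacity = 1, copies = 0;
        for (std::size_t size = 1; size <= n; ++size) {
            if (size > capacity) {   // table is full: rehash into one twice as large
                copies += capacity;  // every existing element is copied once
                capacity *= 2;
            }
        }
        // copies = 1 + 2 + 4 + ... < 2 * n, so the average (amortized) cost
        // per insert stays constant even though the individual inserts that
        // trigger a rehash are expensive.
        std::cout << "copies per insert: " << double(copies) / n << '\n';
    }

Rehashes become geometrically rarer as the table grows, so the total copying work stays below 2n and the average cost per insert remains constant.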
+5  A: 

Yes. To search a hash map with 100 million items added to it, you do this:

1) Calculate the hash of the object you're looking for.
2) Find that bucket
3) Search through that bucket for the item.

(1) is independent of the size of the hash map or number of items in it.
(2) is O(1), assuming a standard hashmap implemented as an array of linked lists.
(3) takes an amount of time related to the number of items in the bucket, which should be approximately (number of items added to hash) / (number of buckets). This part will start at O(1), but will very slowly increase as the number of items begins to greatly exceed the number of buckets.

For almost any purpose, hash maps can be considered O(1) for both insertion and retrieval, even with very large data sets, as long as you start with a sufficiently large number of buckets.
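
As an editorial illustration of the three steps above (not code from this answer; ChainedMap and its methods are made-up names), a minimal separate-chaining map in C++ might look like this:

    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <list>
    #include <string>
    #include <utility>
    #include <vector>

    template <typename K, typename V>
    class ChainedMap {
        std::vector<std::list<std::pair<K, V>>> buckets_;
        std::hash<K> hash_;

    public:
        explicit ChainedMap(std::size_t bucket_count) : buckets_(bucket_count) {}

        void insert(const K& key, const V& value) {
            buckets_[hash_(key) % buckets_.size()].emplace_back(key, value);
        }

        // Lookup: (1) hash the key, (2) pick the bucket, (3) scan the bucket.
        const V* find(const K& key) const {
            const auto& bucket = buckets_[hash_(key) % buckets_.size()]; // steps 1 and 2: O(1)
            for (const auto& entry : bucket) {                           // step 3: O(items in bucket)
                if (entry.first == key) return &entry.second;
            }
            return nullptr;
        }
    };

    int main() {
        ChainedMap<std::string, int> ages(1024); // fixed bucket count for this sketch
        ages.insert("alice", 30);
        if (const int* age = ages.find("alice")) {
            std::cout << "alice is " << *age << '\n';
        }
    }

Steps 1 and 2 cost the same regardless of how many items are stored; only the scan in step 3 grows, in proportion to the load factor (items per bucket).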

Aric TenEyck
+1 good answer, point (3) is important
Paolo
And provided the hash is evenly distributed for your data set.
Yann Schwartz
Is there a way to increase the number of buckets in C++ so that each bucket only has one element?
SuperString
@Danny: That depends on your hash map algorithm. I'm not familiar with the C++ STL (assuming that's what you're using), so I can't say. That said, increasing to the point where each bucket only has one element is overkill - the Birthday Paradox says that for 100 million elements, you'd need a hash map with quadrillions of buckets to avoid collisions, assuming a properly distributed random hash.
Aric TenEyck
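
If the container is C++11's std::unordered_map (an assumption; the question doesn't name the container), the bucket count can be influenced via max_load_factor and reserve. A minimal sketch (scaled down from 100 million to keep memory modest):

    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, std::string> map;

        // Allow at most ~0.5 elements per bucket on average, then
        // pre-allocate buckets for the expected element count.
        map.max_load_factor(0.5f);
        map.reserve(1000000);  // the thread's 100 million works the same way

        std::cout << "bucket count: " << map.bucket_count() << '\n';
    }

This keeps buckets sparse on average, but, per the birthday paradox point above, it cannot guarantee exactly one element per bucket.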
You could use dynamic perfect hashing: http://en.wikipedia.org/wiki/Dynamic_perfect_hashing
Michael Munsey