tags:

views:

1217

answers:

4

Hi,

I have a question with hash_map and map in C++. I understand that map is in STL but hash_map is not a standard. What's the difference of them two?

Any insights are strongly welcomed.

Thanks!

+1  A: 

The C++ spec doesn't say exactly what algorithm you must use for the STL containers. It does, however, put certain constraints on their performance, which rules out the use of hash tables for map and other associative containers. (They're most commonly implemented with red/black trees.) These constraints require better worst-case performance for these containers than hash tables can deliver.

Many people really do want hash tables, however, so hash-based STL associative containers have been a common extension for years. Consequently, they added unordered_map and such to later versions of the C++ standard.

Warren Young
It was actually added in TR1 (std::tr1::unordered_map), not C++0x
Terry Mahaffey
Edited to fix it.
Warren Young
I thought that the reason `map` is generally a balanced btree was due to using `operator<()` as the means of determining location.
KitsuneYMG
@kts: Do any STL implementations actually use a B-tree?
bk1e
Technically all binary search trees are b-trees (a 1-2 tree). That being said, I don't know of any STL that uses anything other than red/black
KitsuneYMG
+8  A: 

They are implemented in very different ways.

hash_map (unordered_map in TR1 and Boost; use those instead) use a hash table where the key is hashed to a slot in the table and the value is stored in a list tied to that key.

map is implemented as a balanced binary search tree (usually a red/black tree).

An unordered_map should give slightly better performance for accessing known elements of the collection, but a map will have additional useful characteristics (e.g. it is stored in sorted order, which allows traversal from start to finish). Unordered_map will be faster on insert and delete than a map.

Joe
I don't fully agree with you regarding the performance. The performance is influenced by a number of parameters and I would scold any programmer using an unordered_map for a mere 10 entries because "It's faster". Worry about interface / functionality first, performance later.
Matthieu M.
Well, yes, it helps if you understand your problem. Up to certain orders of magnitude it is probably a wash performance-wise, but it is important to understand the performance characteristics of both containers as they do deviate in different ways as data volumes get larger.
Joe
Interestingly, I just swapped a std::map with a boost::unordered_map in an application in which I do a lot of random lookups, but also iterate over all the keys in the map. I saved a large amount of time in lookup, but gained it back via the iterations, so I switched back to map and am looking for other ways to improve application performance.
Erik Garrison
+6  A: 

hash_map was a common extension provided by many library implementations. That is exactly why it was renamed to unordered_map when it was added to the C++ standard as part of TR1. map is generally implemented with a balanced binary tree like a red-black tree (implementations vary of course). hash_map and unordered_map are generally implemented with hash tables. Thus the order is not maintained. unordered_map insert/delete/query will be O(1) (constant time) where map will be O(log n) where n is the number of items in the data structure. So unordered_map is faster, and if you don't care about the order of the items should be preferred over map. Sometimes you want to maintain order (ordered by the key) and for that map would be the choice.

janglin
I would point out that hashmap has a worst case access of O(N) when collisions are likely (bad hash fcn, loading factor too high, etc)
KitsuneYMG
+1  A: 

Some of the key differences are in the complexity requirements.

A map requires O(log(N)) time for inserts and finds.

An unordered_map requires an 'average' time of O(1) for inserts and finds but is allowed to have a worst case time of O(N).

So, usually, unordered_map will be faster, but depending on the keys and the hash function you store, can become much worse.

R Samuel Klatchko