When you need order, use an ordered container. There is no point in paying the cost of sorting later on.
Your current solution is:

- Insertion: O(1)
- Retrieval: O(N log N)
- Removal: O(N)

(which is as good as you can get without keeping another index around)
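For reference, your exact code isn't shown, but one structure with precisely these complexities is an unsorted vector that gets sorted at retrieval time. A purely illustrative sketch (the `int` types and the `insert`/`retrieve`/`remove` names are placeholders, not your actual code):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical reconstruction: an unsorted vector of (key, value)
// pairs, sorted on demand, matches the complexities listed above.
std::vector<std::pair<int, int>> data;

void insert(int key, int value) {
    data.emplace_back(key, value);              // O(1) amortized
}

// Retrieval pays O(N log N) for the sort on every call.
auto retrieve(int key) {
    std::sort(data.begin(), data.end());
    return std::equal_range(
        data.begin(), data.end(), std::make_pair(key, 0),
        [](auto const& a, auto const& b) { return a.first < b.first; });
}

void remove(std::pair<int, int> const& entry) { // O(N): find + shift
    auto it = std::find(data.begin(), data.end(), entry);
    if (it != data.end()) data.erase(it);
}
```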
Simply by using a `std::multimap`, you could have:

- Insertion: O(log N)
- Retrieval: O(log N) <-- much better, isn't it? (we still need to find the end of the range of equal keys)
- Removal: O(N)
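A minimal sketch of that, assuming `key_type` and `value_type` stand in for your actual types:

```cpp
#include <iostream>
#include <map>

using key_type   = int;   // placeholders for your actual types
using value_type = int;

int main() {
    std::multimap<key_type, value_type> store;

    // Insertion: O(log N)
    store.insert({42, 7});
    store.insert({42, 8});

    // Retrieval: O(log N) -- equal_range locates both ends of the
    // run of entries sharing the key.
    auto [first, last] = store.equal_range(42);
    for (auto it = first; it != last; ++it)
        std::cout << it->second << '\n';
}
```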
Now, you could do slightly better with a `std::map<key, std::vector<value>>`:

- Insertion: O(log M), where M is the number of distinct keys
- Retrieval: O(1) (`begin` is guaranteed to be amortized constant time)
- Removal: O(N)
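The same sketch adapted to that layout:

```cpp
#include <iostream>
#include <map>
#include <vector>

using key_type   = int;   // placeholders for your actual types
using value_type = int;

int main() {
    std::map<key_type, std::vector<value_type>> store;

    // Insertion: O(log M) to locate the key, plus an amortized
    // O(1) push_back into its vector.
    store[42].push_back(7);
    store[42].push_back(8);

    // Retrieval: begin() is (amortized) constant time and hands back
    // the smallest key together with all of its values at once.
    auto& [key, values] = *store.begin();
    std::cout << key << " -> " << values.size() << " values\n";
}
```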
You can't really improve the random removal any further... unless you're willing to keep another index. For example:

```cpp
typedef std::vector<value_type> data_value_t;
typedef std::map<key_type, data_value_t> data_t;

// where the iterator gives you the right vector
// and the size_t is an index into it
typedef std::pair<data_t::iterator, std::size_t> index_value_t;
typedef std::unordered_map<value_type, index_value_t> index_t;
```
But keeping this second index up to date is error-prone... and it comes at the expense of the other operations! With this structure, you would have:

- Insertion: O(log M) (insertion into the hash map itself is amortized O(1), so the `std::map` lookup dominates)
- Retrieval: O(N/M) (we need to hand back all the values in the key's vector, and there are N/M of them on average)
- Removal: O(N/M) (finding the entry in the hash map is O(1), dereferencing it is O(1), and removing from the vector is O(N/M), because on average we shift about half of the vector's content; sketched below)

Using a list instead of a vector would make the erasure itself O(1)... but it might not actually be faster (it depends on the number of elements, because of the memory trade-off).
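To make the maintenance burden concrete, here is a sketch of the removal path with that structure. It assumes values are unique (they key the index), uses C++17 structured bindings, and omits error handling:

```cpp
#include <cstddef>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

using key_type   = int;   // placeholders for your actual types
using value_type = int;

typedef std::vector<value_type> data_value_t;
typedef std::map<key_type, data_value_t> data_t;
typedef std::pair<data_t::iterator, std::size_t> index_value_t;
typedef std::unordered_map<value_type, index_value_t> index_t;

void remove(data_t& data, index_t& index, value_type v) {
    auto it = index.find(v);               // finding in hash map: O(1)
    if (it == index.end()) return;

    auto [map_it, pos] = it->second;       // dereferencing: O(1)
    index.erase(it);

    data_value_t& vec = map_it->second;
    vec.erase(vec.begin() + pos);          // O(N/M): shifts the tail left

    // The error-prone part: every element that shifted now has a stale
    // position recorded in the index and must be patched by hand.
    for (std::size_t i = pos; i < vec.size(); ++i)
        index[vec[i]].second = i;

    if (vec.empty())
        data.erase(map_it);                // drop the now-empty key
}
```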
Also bear in mind that hash map complexities are amortized. Trigger a rehash because you exceeded the maximum load factor, and that particular insertion will take a very long time.
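If you know (or can bound) the number of elements up front, `unordered_map::reserve` sidesteps that cost by allocating the buckets in advance:

```cpp
#include <unordered_map>

int main() {
    std::unordered_map<int, int> index;
    index.reserve(1'000'000);  // no rehash until the table holds ~1M entries
    // ... insertions now never pay for an in-flight rehash
}
```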
If I were you, I'd really go with the `std::map<key_type, std::vector<value_type>>`. That's the best bang for the buck.