tags:

views:

859

answers:

7

I currently have a std::map<std::string,int> that stores an integer value to an unique string identifier, and I do look up with the string. It does mostly what I want, except for that it does not keep track of the insertion order. So when I iterate the the map to print out the values, they are sorted according to the string; but I want them to be sorted according to the order of (first) insertion.

I thought about using a vector<pair<string,int>> instead, but I need to look up the string and increment the integer values about 10,000,000 times, so I don't know whether a vector will be significantly slower.

Is there a way to use std::map or is there another std container that better suits my need?

[I'm on GCC 3.4, and I have probably no more than 50 pairs of values in my std::map].

+1  A: 

You cannot do that with a map, but you could use two separate structures - the map and the vector and keep them synchronized - that is when you delete from the map, find and delete the element from the vector. Or you could create a map<string, pair<int,int>> - and in your pair store the size() of the map upon insertion to record position, along with the value of the int, and then when you print, use the position member to sort.

Faisal Vali
+3  A: 

If you need both lookup strategies, you will end up with two containers. You may use a vector with your actual values (ints), and put a map< string, vector< T >::difference_type> next to it, returning the index into the vector.

To complete all that, you may encapsulate both in one class.

But I believe boost has a container with multiple indices.

xtofl
+9  A: 
Kirill V. Lyadvinsky
Thats great! Boost even has a member-selector to do the job!
xtofl
Yes, multi_index is my favorite feature in boost :)
Kirill V. Lyadvinsky
I've never used multi_index before, but IMHO that seems a bit overkill (with baroque syntax to boot) for a 50-element container.
Kristo
@Kristo, Future insert and find operations are very simple. And it's work very fast. hashed_unique index has O(1) complexity.
Kirill V. Lyadvinsky
@jia3ep, Oh I don't doubt that this implementation would be efficient and effective in the general case. But I wrote my entire answer for the OP's case in the same space you used to declare your container.
Kristo
@Kristo: it's not about container size, it's about reusing existing implementation for exactly this problem. That's classy. Admittedly, C++ isn't a functional language, so the syntax is somewhat elaborate.
xtofl
@Kristo, next to that, your succinct solution doesn't allow random accessing the counters themselves. (neither does mine, directly)
xtofl
Since when was programming about saving key strokes?
GMan
+1  A: 

You might combine a std::vector with a std::tr1::unordered_map (a hash table). Here's a link to Boost's documentation for unordered_map. You can use the vector to keep track of the insertion order and the hash table to do the frequent lookups. If you're doing hundreds of thousands of lookups, the difference between O(log n) lookup for std::map and O(1) for a hash table might be significant.

std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;

// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");

/* Increment things in myTable 100000 times */

// Print the final results.
for (int i = 0; i < insertOrder.size(); ++i)
{
    const std::string &s = insertOrder[i];
    std::cout << s << ' ' << myTable[s] << '\n';
}
Kristo
but of course, your can't access the counters by insert order...
xtofl
@xtofl, How does that make my answer not helpful and thus worthy of a downvote? Is my code incorrect in some way?
Kristo
A: 

This is somewhat related to Faisals answer. You can just create a wrapper class around a map and vector and easily keep them synchronized. Proper encapsulation will let you control the access method and hence which container to use... the vector or the map. This avoids using Boost or anything like that.

Polaris878
A: 

Another way to implement this is with a map instead of a vector. I will show you this approach and discuss the differences:

Just create a class that has two maps behind the scenes.

#include <map>
#include <string>

using namespace std;

class SpecialMap {
  // usual stuff...

 private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> data_;
};

You can then expose an iterator to iterator over data_ in the proper order. The way you do that is iterate through insertion_order_, and for each element you get from that iteration, do a lookup in the data_ with the value from insertion_order_

You can use the more efficient hash_map for insertion_order since you don't care about directly iterating through insertion_order_.

To do inserts, you can have a method like this:

void SpecialMap::Insert(const string& key, int value) {
  // This may be an over simplification... You ought to check
  // if you are overwriting a value in data_ so that you can update
  // insertion_order_ accordingly
  insertion_order_[counter_++] = key;
  data_[key] = value;
}

There are a lot of ways you can make the design better and worry about performance, but this is a good skeleton to get you started on implementing this functionality on your own. You can make it templated, and you might actually store pairs as values in data_ so that you can easily reference the entry in insertion_order_. But I leave these design issues as an exercise :-).

Update: I suppose I should say something about efficiency of using map vs. vector for insertion_order_

  • lookups directly into data, in both cases are O(1)
  • inserts in the vector approach are O(1), inserts in the map approach are O(logn)
  • deletes in the vector approach are O(n) because you have to scan for the item to remove. With the map approach they are O(logn).

Maybe if you are not going to use deletes as much, you should use the vector approach. The map approach would be better if you were supporting a different ordering (like priority) instead of insertion order.

Tom
The map approach is also better if you need to get items by the "insertion id". For example, if you want the item that was inserted 5th, you do a lookup in insertion_order with key 5 (or 4, depending where you start counter_). With the vector approach, if the 5th item was deleted, you would actually get the 6th item that was inserted.
Tom
A: 

One thing you need to consider is the small number of data elements you are using. It is possible that it will be faster to use just the vector. There is some overhead in the map that can cause it to be more expensive to do lookups in small data sets than the simpler vector. So, if you know that you will always be using around the same number of elements, do some benchmarking and see if the performance of the map and vector is what you really think it is. You may find the lookup in a vector with only 50 elements is near the same as the map.

Chad Simpkins