views:

94

answers:

4

I'm looking for a STL container that works like std::multimap, but has constant access time to random n-th element. I need this because I have such structure in memory that is std::multimap for many reasons, but items stored in it have to be presented to the user in a listbox. Since amount of data is huge, I'm using list box with virtual items (i.e. list control polls for value at line X).

As a workaround I'm currently using additional std::vector to store "indexes" into std::map, and I fill it like this:

std::vector<MMap::data_type&> vec;
for (MMap::iterator it = mmap.begin(); it != mmap.end(); ++it)
    vec.push_back((*it).second);

But this is not very elegant solution.

Is there some such containter?

+4  A: 

What you need is: Boost Multi-Index

Autopulated
My I guess is that I need to combine sequenced and regular index in the same container. However, I get weird errors from compiler when I try that. Is this possible?
Milan Babuškov
Yes, that's possible: have a look at this example: http://www.boost.org/doc/libs/1_42_0/libs/multi_index/example/sequenced.cpp
Autopulated
Thanks everyone for answers. Unfortunately, I can accept only one, so this will be it.
Milan Babuškov
+2  A: 

How many items are in that list, what type are the items of, and how often do you have to insert or delete in the middle of it? Depending on this, you might be fine with using a sorted std::vector<std::pair<key,value>> and using std::binary_search with a search predicate comparing only the keys.

sbi
Items: up to 20milion, type is a class, I can live if insert or delete in middle is slow.
Milan Babuškov
@Milan: Do you indeed store objects of that class, or pointers? 20 million is a lot, I guess, even if the class is small. You might have to profile. Why don't ou write your own STL container providing what operations you need. Then you could switch its implementation from `std::map`+`std::vector` (what you have now) to `boost::multi_index::multi_index_container` to `std::vector<std::pair<const key,value>>` or whatever else you'll find and measure which one performs best. This would also allow you to go on now and come back for optimizations later.
sbi
Best of luck to manipulate a `std::pair<const Key, Value>` in a vector. The type not being assignable really makes it a pain ;)
Matthieu M.
@Matthieu: Is that any different from `std::map`? As far as I know you by now I'm sure I'm missing something. Could you please elaborate?
sbi
The problem is that the map allocates a node and never moves the items around while reallocation of the buffer or insertion in the middle will cause elements to be shuffled around which requires the elements to be Assignable. Take a look at how Alexandrescu dealt with the issue in its AssocVector: http://loki-lib.sourceforge.net/html/a00645.html. I think you can use a `std::pair<Key,Value>` in the inside (to respect the requirements) and some `reinterpret_cast` trickery to make the caller believe the type is actually `std::pair<const Key, Value>` ;)
Matthieu M.
@Matthieu: Ah, I didn't think of that. You're right, of course. Well, I guess the easiest way to deal with this is to drop the `const`, so I'll edit my answer accordingly.
sbi
I could also store pointers instead of items.
Milan Babuškov
@Milan: If you did that, the problem of inserting/removing in the middle of the vector would become one of moving some memory. IMO that would definitely make it worth to try the vector.
sbi
+1  A: 

On top of the requirements, it seems from your comments that you also plan to insert / delete items. I must admit 20 millions seems quite a lot.

Now, I understand the idea of polling, but have you consider something like unordered_multimap ? Instead of polling at a position, you can poll at a key in O(1) though with a slightly bigger overhead depending on the key type.

The main advantage is not to have to deal with keeping 2 contains in sync. Of course it does not work if you want the content sorted.

Thus, if you want the content sorted plus fast (not O(1)) access to a random position, you could consider B+Tree or Radix Tree. Their idea is to keep items in contiguous zones of memory, a few hundreds at a time.

That's just of the top of my head. Consider autopopulated answer's if you want a baked in solution :)

Matthieu M.
Well, let's say it's up-to-20mil. Most real cases will be in the range of 300.000 to 2mil. records. Key is a string, btw.
Milan Babuškov
Well for `std::string` hashing does add some overhead, so it's a big O(1).
Matthieu M.
A: 

Try hash_multimap. Hashes offer roughly constant time. Hash_multimap is an extension in Visual Studio, and I'm fairly sure that GCC offers a similar extension too. There's hash somethings in Boost if you're desperate.

DeadMG
How does this help with ordering items which need to be accessed by a key, too?
sbi
Hash maps hash a key. How else do they find anything?
DeadMG