Often, it is more efficient to use a sorted std::vector instead of a std::set. Does anyone know of a library class sorted_vector that has an interface similar to std::set, but inserts elements into a sorted vector (so that there are no duplicates), uses binary search to find elements, etc.?

I know it's not hard to write, but probably better not to waste time and use an existing implementation anyway.

Update: The reason to use a sorted vector instead of a set is: If you have hundreds of thousands of little sets that contain only 10 or so members each, it is more memory-efficient to just use sorted vectors instead.
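For reference, the behaviour asked for can be sketched with the standard algorithms alone. This is only a minimal illustration, not an existing library class: the names `sorted_insert` and `sorted_contains` are made up here, and the element type is assumed to provide `operator<`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helpers for illustration; not taken from any library.

// Insert value into an already sorted vector, skipping duplicates.
// Returns true if the value was actually inserted.
template <typename T>
bool sorted_insert(std::vector<T>& v, const T& value) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition, checked in debug builds
    auto pos = std::lower_bound(v.begin(), v.end(), value);  // first element not less than value
    if (pos != v.end() && !(value < *pos)) {
        return false;  // an equivalent element is already present
    }
    v.insert(pos, value);  // shifts the tail of the vector: O(n) moves
    return true;
}

// Membership test by binary search: O(log n) comparisons.
template <typename T>
bool sorted_contains(const std::vector<T>& v, const T& value) {
    assert(std::is_sorted(v.begin(), v.end()));
    return std::binary_search(v.begin(), v.end(), value);
}
```

With around 10 elements per container, the shifting done by `insert` touches only a handful of contiguous objects, which is the case the update above has in mind.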

+3  A: 

I think there's no 'sorted container' adapter in the STL because the associative containers already keep things sorted and are appropriate to use in nearly all cases. To be honest, about the only reason I can think of off the top of my head for having a sorted vector<> container might be to interoperate with C functions that expect a sorted array. Of course, I may be missing something.

If you feel that a sorted vector<> would be more appropriate for your needs (being aware of the shortcomings of inserting elements into a vector), here's an implementation on Code Project:

I've never used it, so I can't vouch for it (or its license, if any is specified). But from a quick read of the article, it looks like the author at least made a good effort to give the container adapter an appropriate STL interface.

It seems to be worth a closer look.

Michael Burr
+4  A: 

The reason such a container is not part of the standard library is that it would be inefficient. Using a vector for storage means objects have to be moved whenever something is inserted in the middle of the vector. Doing this on every insertion gets needlessly expensive. (On average, half the objects will have to be moved for each insertion. That's pretty costly.)

If you want a sorted vector, it is likely better to insert all the elements, and then call std::sort() once, after the insertions.
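A minimal sketch of that approach, assuming the elements can be collected first and only need to be sorted and duplicate-free afterwards. The helper name `sort_and_dedupe` is made up for illustration, and the element type is assumed to have consistent `operator<` and `operator==`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: turn an arbitrary vector into a sorted one without duplicates.
template <typename T>
void sort_and_dedupe(std::vector<T>& v) {
    std::sort(v.begin(), v.end());                      // one O(n log n) sort
    v.erase(std::unique(v.begin(), v.end()), v.end());  // drop adjacent equal elements
    assert(std::is_sorted(v.begin(), v.end()));         // postcondition, debug builds only
}
```

Elements are appended with plain `push_back` in any order, and `sort_and_dedupe` is called once before the first lookup.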

jalf
So, update the sorted vector class for C++0x move semantics.
Ken Bloom
I don't see how that would solve the problem. All the objects still have to be touched, even if it is only a pointer swap. You're still trying to do something that the data structure just isn't suited for.
jalf
I started writing an answer like that, and stopped because it's simply not really true. For less than a few dozen elements, which is pretty common really, moving on average half can easily be less expensive than performing an allocation and a tree rebalance. Of course it's better to call `sort` once, and I personally wouldn't look for a container to do this, but it's a matter of style.
Potatoswatter
Inserting n elements into a sorted array costs log n to find the insertion point plus n/2 on average to move the existing elements, per insertion. That is O(n*n) overall, since the moves dominate the search: not efficient at all. Might work out if n is small enough though.
Mark Ransom
@Potatoswatter: Replacing it with a node-based data structure wasn't my suggested alternative though. Like you say, the heap allocations and tree rebalancing get pricey too (although a custom allocator could help somewhat). Sorting once, at the end, was my suggestion.
jalf
A: 

Alexandrescu's Loki has a sorted vector implementation, if you don't want to go through the relatively insignificant effort of rolling your own.

Lance Diduck
Ah, this one: http://loki-lib.sourceforge.net/html/a00025.html. Thanks!
Frank
+2  A: 

If you decide to roll your own, you might also want to check out boost::ublas. Specifically:

#include <boost/numeric/ublas/vector_sparse.hpp>

and look at coordinate_vector, which implements a vector of values and indexes. This data structure supports O(1) insertion (violating the sort order), but then sorts on demand, which costs Omega(n log n). Of course, once it's sorted, lookups are O(log n). If part of the array is already sorted, the algorithm recognizes this and sorts only the newly added elements, then does an in-place merge. If you care about efficiency, this is probably the best you can do.
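The append-then-sort-on-demand idea can be sketched with standard algorithms. This is not the uBLAS code: `lazy_sorted_vector` and its members are hypothetical names, and unlike a set this sketch keeps duplicates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class illustrating the idea; this is not the uBLAS implementation.
template <typename T>
class lazy_sorted_vector {
public:
    // Append without maintaining order: amortized O(1).
    void append(const T& value) { data_.push_back(value); }

    // Restore the order if needed, then binary search.
    bool contains(const T& value) {
        ensure_sorted();
        return std::binary_search(data_.begin(), data_.end(), value);
    }

    // Access the elements in sorted order.
    const std::vector<T>& sorted() {
        ensure_sorted();
        return data_;
    }

private:
    void ensure_sorted() {
        if (sorted_size_ == data_.size()) return;  // nothing appended since the last sort
        auto mid = data_.begin() + static_cast<std::ptrdiff_t>(sorted_size_);
        std::sort(mid, data_.end());                          // sort only the new tail
        std::inplace_merge(data_.begin(), mid, data_.end());  // merge it with the sorted prefix
        sorted_size_ = data_.size();
        assert(std::is_sorted(data_.begin(), data_.end()));
    }

    std::vector<T> data_;
    std::size_t sorted_size_ = 0;  // data_[0, sorted_size_) is known to be sorted
};
```

Batching many appends between lookups is what makes this pay off; alternating single appends and lookups degrades to roughly the cost of inserting into a sorted vector each time.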

Neil G