views: 479

answers: 9
We have large (100,000+ elements) ordered vectors of structs (operator < overloaded to provide ordering):

std::vector<MyType> vectorMyTypes;
std::sort(vectorMyTypes.begin(), vectorMyTypes.end());

My problem is that we're seeing performance problems when adding new elements to these vectors while preserving sort order. At the moment we're doing something like:

for ( a very large set )
{
    vectorMyTypes.push_back(newType);
    std::sort(vectorMyTypes.begin(), vectorMyTypes.end());

    ...

    ValidateStuff(vectorMyTypes); // this method expects the vector to be ordered
}

This isn't exactly what our code looks like since I know this example could be optimised in different ways, however it gives you an idea of how performance could be a problem because I'm sorting after every push_back.

I think I essentially have two options to improve performance:

  1. Use a (hand crafted?) insertion sort instead of std::sort to improve the sort performance (insertion sorts on a partially sorted vector are blindingly quick)

  2. Create a heap by using std::make_heap and std::push_heap to maintain the sort order

My questions are:

  • Should I implement an insertion sort? Is there something in Boost that could help me here?

  • Should I consider using a heap? How would I do this?


Edit:

Thanks for all your responses. I understand that the example I gave was far from optimal and it doesn't fully represent what I have in my code right now. It was simply there to illustrate the performance bottleneck I was experiencing - perhaps that's why this question isn't seeing many up-votes :)

Many thanks to onebyone.livejournal.com; it's often the simplest answers that are the best, and perhaps it was my over-analysis of the problem that blinded me to the most obvious solution. I do like the neat method you outlined for inserting directly into a pre-ordered vector.

As I've commented, I'm constrained to using vectors right now, so std::set, std::map, etc aren't an option.

+3  A: 

Why not just use a binary search to find where to insert the new element? Then you will insert exactly into the required position.

sharptooth
Agree, but you still have the overhead of reallocating space for the vector, and of shifting every element after the insertion point by one.
John Pirie
Reallocation is amortised constant time, and if you know the number of inserted elements up front, you can reallocate in one go.
xtofl
Yes, but it does invalidate whatever pointer / iterator you had prior.
Gab Royer
+5  A: 

Ordered insertion doesn't need boost:

vectorMyTypes.insert(
    std::upper_bound(vectorMyTypes.begin(), vectorMyTypes.end(), newType),
    newType);

upper_bound provides a valid insertion point provided that the vector is sorted to start with, so as long as you only ever insert elements in their correct place, you're done. I originally said lower_bound, but if the vector contains multiple equal elements, then upper_bound selects the insertion point which requires less work.

This does have to copy O(n) elements, but you say insertion sort is "blindingly fast", and this is faster. If it's not fast enough, you have to find a way to add items in batches and validate at the end, or else give up on contiguous storage and switch to a container which maintains order, such as set or multiset.

A heap does not maintain order in the underlying container, but is good for a priority queue or similar, because it makes removal of the maximum element fast. You say you want to maintain the vector in order, but if you never actually iterate over the whole collection in order then you might not need it to be fully ordered, and that's when a heap is useful.

Steve Jessop
good idea, but doesn't need lower_bound a third argument?
xtofl
Yes. Yes it does.
Steve Jessop
And the fact that upper_bound is O(n) (IIRC), compared to binary_search which is O(log n) doesn't matter as the insertion is O(n)
Gab Royer
upper_bound performs a binary search, and the standard guarantees that it is O(log N) for random access iterators. The difference is that it returns an iterator whereas `binary_search` returns `bool`.
Steve Jessop
A: 

Why not use boost::multi_index?

NOTE: boost::multi_index does not provide memory contiguity, a property of std::vectors by which elements are stored adjacent to one another in a single block of memory.

Kirill V. Lyadvinsky
A: 

There are a few things you need to do.

  1. You may want to consider making use of reserve() to avoid excessive re-allocation of the entire vector. If you have knowledge of the size it will grow to, you may gain some performance by doing the reserve()s yourself (rather than having the implementation do them automatically using its built-in heuristic).

  2. Do a binary search to find the insertion location. Then resize and shift everything following the insertion point up by one to make room.

  3. Consider: do you really want to use a vector? Perhaps a set or map are better.

The advantage of binary search over lower_bound is that if the insertion point is close to the end of the vector you don't have to pay the theta(n) complexity.

Ari
vector already has a strategy for reallocating the buffer in large increments. Unless the user knows the final size in advance reserve() will not usually help much.
sharptooth
That's what judicious means. If he knows better, let him `reserve`. If not, not. Experimentation will yield the correct answer.
Ari
But you won't have to "realloc and copy the entire vector each time" if you let it do its default thing. So I think you may have presented a straw man argument in favour of the programmer using `reserve()`.
Steve Jessop
It's just a question of how good the built-in heuristic is and the time/memory tradeoff. I'm just saying it *may* be an issue. I guess I'll soften the language.
Ari
A: 

Using a binary search to find the insertion location isn't going to speed up the algorithm much because it will still be O(N) to do the insertion (consider inserting at the beginning of a vector - you have to move every element down one to create the space).

A tree (aka heap) will be O(log(N)) to insert, much better performance.

See http://www.sgi.com/tech/stl/priority_queue.html

Note that a tree will still have worst case O(N) performance for insert unless it is balanced, e.g. an AVL tree.

Larry Watanabe
Asymptotically you are correct, but in practice, if the insertion point is often enough close enough to the end of the `vector`, there are substantial gains to be had by using binary search. (Bear in mind that comparisons are done via an overloaded function call and are potentially quite expensive.)
Ari
Why settle for O(N) insertion (both worst case and average case) when you can get O(log(N)) performance?
Larry Watanabe
Because he may need lots of random access reads, which are O(1) with a `vector` as opposed to logarithmic with `set`/`map`
Ari
+1  A: 

If you need to insert a lot of elements into a sorted sequence, use std::merge, potentially sorting the new elements first:

void add( std::vector<Foo> & oldFoos, const std::vector<Foo> & newFoos ) {
    std::vector<Foo> merged;
    // precondition: oldFoos _and newFoos_ are sorted
    merged.reserve( oldFoos.size() + newFoos.size() ); // only for std::vector
    std::merge( oldFoos.begin(), oldFoos.end(),
                newFoos.begin(), newFoos.end(),
                std::back_inserter( merged ) );
    // apply std::unique, if wanted, here
    merged.erase( std::unique( merged.begin(), merged.end() ), merged.end() );
    oldFoos.swap( merged ); // commit changes
}
+1 if the user's requirements allow multiple new elements to be added in batches.
Steve Jessop
+5  A: 

According to Item 23 of Meyers' Effective STL, you should use a sorted vector if your application uses its data structures in three phases. From the book, they are:

  1. Setup. Create a new data structure by inserting lots of elements into it. During this phase, almost all operations are insertions and erasures. Lookups are rare or nonexistent.
  2. Lookup. Consult the data structure to find specific pieces of information. During this phase, almost all operations are lookups. Insertions and erasures are rare or nonexistent. There are so many lookups, the performance of this phase makes the performance of the other phases incidental.
  3. Reorganize. Modify the contents of the data structure, perhaps by erasing all the current data and inserting new data in its place. Behaviorally, this phase is equivalent to phase 1. Once this phase is completed, the application returns to phase 2.

If your use of your data structure resembles this, you should use a sorted vector and then use binary_search as mentioned. If not, a typical associative container should do it; that means a set, multiset, map or multimap, as those structures are ordered by default.

Gab Royer
Exactly what I was about to say.
Edouard A.
it sounds like the requirement is for the set to be ordered as well, however
Larry Watanabe
Isn't that what the asker wanted in the first place?
Gab Royer
Or multi_set, we don't know whether equal elements are permitted.
Steve Jessop
Unless you've profiled your app and found these lookups are a key performance (or space) problem, you should *always* prefer to use a set rather than a sorted vector: it makes your code clearer and more maintainable.
chrispy
Unfortunately I'm constrained by technical reasons to using vectors - It was my first thought also! :)
Alan
*Unless you've profiled your app and found these lookups are a key performance* -- Agreed
Gab Royer
A: 
  1. If you want to insert an element into the "right" position, why do you plan on using sort? Find the position using lower_bound and insert, using, well, the `insert` method of the vector. That will still be O(N) to insert a new item.

  2. A heap is not going to help you, because a heap is not sorted. It allows you to get at the smallest element quickly, and then quickly remove it and get the next smallest element. However, the data in a heap is not stored in sorted order, so if you have algorithms that must iterate over the data in order, it will not help.

I am afraid your description skipped too much detail, but it seems like vector is just not the right container for the task. std::deque is much better suited for insertion in the middle, and you might also consider std::set. I suggest you explain why you need to keep the data sorted to get more helpful advice.

Vladimir Prus
A: 

You might want to consider using a BTree or a Judy Trie.

  • You don't want to use contiguous memory for large collections, insertions should not take O(n) time;
  • You want to use at least binary insertion for single elements, multiple elements should be presorted so you can make the search boundaries smaller;
  • You do not want your data structure wasting memory, so nothing with left and right pointers for each data element.
Stephan Eggermont