views: 221
answers: 6
I have an array of 1000-2000 elements which are pointers to objects. I want to keep my array sorted, and obviously I want to do this as quickly as possible. They are sorted by a member, and the objects are not allocated contiguously, so assume a cache miss whenever I access the sort-by member.

Currently I'm sorting on-demand rather than on-add, but because of the cache misses and the [presumably] non-inlined member access, the inner loop of my quicksort is slow.

I'm running tests and trying things now (to see what the actual bottleneck is), but can anyone recommend a good way to speed this up? Should I do an insertion sort instead of quicksorting on-demand, or should I change my model to make the elements contiguous and reduce cache misses? Or is there a sort algorithm I've not come across which is good for data that is going to cache miss?

Edit: Maybe I worded this wrong :). I don't actually need my array sorted all the time (I'm not iterating through it sequentially for anything); I just need it sorted when I'm doing a binary chop to find a matching object, and doing that quicksort at that point (when I want to search) is currently my bottleneck, because of the cache misses and jumps. (I'm using a < operator on my object, but I'm hoping that inlines in release.)
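For concreteness, the sort-on-demand-then-binary-chop setup described above could look like this minimal sketch (`Obj` and its `key` member are illustrative names, not from the question):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative element type; only the sort-by member matters here.
struct Obj { int key; };

// Sort on demand: every comparison dereferences a pointer, so with
// scattered allocations each comparison is a likely cache miss.
void sort_on_demand(std::vector<Obj*>& arr) {
    std::sort(arr.begin(), arr.end(),
              [](const Obj* a, const Obj* b) { return a->key < b->key; });
}

// Binary chop over the (now sorted) pointer array.
Obj* find(const std::vector<Obj*>& arr, int key) {
    auto it = std::lower_bound(arr.begin(), arr.end(), key,
                               [](const Obj* p, int k) { return p->key < k; });
    return (it != arr.end() && (*it)->key == key) ? *it : nullptr;
}
```

The comparator in `sort_on_demand` is exactly where the per-comparison cache miss lands.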

+1  A: 

Running a quicksort on each insertion is enormously inefficient. Doing a binary search and insert operation would likely be orders of magnitude faster. Using a binary search tree instead of a linear array would reduce the insert cost.

Edit: I missed that you were doing sort on extraction, not insert. Regardless, keeping things sorted amortizes sorting time over each insert, which almost has to be a win, unless you have a lot of inserts for each extraction.

If you want to keep the sort on-extract methodology, then maybe switch to merge sort, or another sort that has good performance for mostly-sorted data.
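A minimal sketch of the binary-search-and-insert idea (assuming a `std::vector` of pointers; `Obj` and `key` are illustrative names):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Obj { int key; };  // illustrative element type

// Binary search for the slot (O(log N) comparisons), then insert;
// shifting the pointers afterwards is a contiguous move, which is cheap.
void sorted_insert(std::vector<Obj*>& arr, Obj* p) {
    auto pos = std::lower_bound(arr.begin(), arr.end(), p,
                                [](const Obj* a, const Obj* b) {
                                    return a->key < b->key;
                                });
    arr.insert(pos, p);
}
```

With the array kept sorted like this, the later binary chop needs no sorting pass at all.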

Mark Bessey
I'm not running a quicksort on each insertion;("on-add") I'm doing a sort when I need to find an element via a binary chop ("on-demand") So say, for every 100 add's, I sort it once.I'll try putting in Insert-sort code and see what difference that makes initially...
Soylent Graham
Though with a dynamic binary search tree, using (or implementing) some form of (self-)balancing would probably be a good idea.
JAB
@JAB You mean like std::set?
anon
@Neil: I assume so. I don't have that much experience with C++ and its standard library. (Yet.)
JAB
I realised I almost never do a swap when quicksorting these elements post-add, so I'm adding a binary-chop (or the "skip list" mentioned below) and insert system like you mentioned (technically a swap rather than an insert, as I'm doing this post-add with a sort policy), as well as a simple check that appending one element hasn't broken the sort order. Results to follow...
Soylent Graham
+1  A: 

I think the best approach in your case would be changing your data structure to something logarithmic and rethinking your architecture. The bottleneck of your application is not the sorting itself; the real question is why you have to re-sort everything and then compensate for it with an on-demand sort.

Another thing you could try (based on your current implementation) is keeping an external pointer-to-key mapping table and sorting those secondary keys instead, but I actually doubt it would help in this case.
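One concrete form of such a logarithmic structure would be a self-balancing tree, e.g. `std::multiset` with a comparator on the sort-by member (a sketch with illustrative names):

```cpp
#include <cassert>
#include <set>

struct Obj { int key; };  // illustrative element type

// Orders pointers by the member they should be sorted on.
struct ByKey {
    bool operator()(const Obj* a, const Obj* b) const { return a->key < b->key; }
};

// A balanced tree under the hood: each insert and lookup is O(log N),
// so there is never a whole-array sort to pay for before a search.
using ObjSet = std::multiset<Obj*, ByKey>;
```

Lookups can use a stack-allocated probe object carrying just the key, so no quicksort ever runs.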

Kotti
Well, actually my last resort is keeping a second array as a simple key/index table (a hash table, if you will), which would be simple, contiguous integers and presumably a much faster quicksort. It would just add the overhead of maintaining the key/index array...
Soylent Graham
A: 

Simple approach: insertion sort on every insert. Since your elements are not contiguous in memory, I'm guessing a linked list. If so, you could transform it into a linked list with jumps to the 10th element, the 100th, and so on. This is similar to the next suggestion.

Or you could reorganize your container into a binary tree (or whatever tree you like: B, B*, red-black, ...) and insert elements as you would insert them into a search tree.

DaClown
As far as I remember, *skip lists* provide the same complexity (in big-O terms) as trees do, but have worse constants, so a tree would definitely suit better.
Kotti
I have an array of pointers rather than a linked list, so I guess a binary chop will give me better results than skipping. A separate container/hash/key array is still plan B, after looking into search trees :)
Soylent Graham
"Skip list", I forgot its name. You're right, a tree would be the better choice, but I think it would also be the choice that requires more work. Adding a few pointers to a generic object along with a consistency-checker function sounds like less work than a whole new structure. Nevertheless, a tree would be more efficient.
DaClown
After some investigation I found that in 90-99% of cases there was no sorting to do, so the quicksort was a worst-case scenario, and in that worst case (the largest of my sorted arrays) most elements fit near the end anyway, so inserting the new element at the right position on-add was the most efficient solution. I should have analyzed the data rather than profiling the code :)
Soylent Graham
@Kotti: don’t be so sure about the skip list constants. In fact, skip lists are much more cache friendly than trees. Their performance crucially depends on the quality and performance of the random number generator. If that is chosen well, the overall structure can beat most tree implementations hands down.
Konrad Rudolph
+2  A: 
andand
+1  A: 

Instead of an array of pointers, you may consider an array of structs which contain both a pointer to your object and the sort criterion. That is:

Instead of

struct MyType {
    // ...
    int m_SomeField; // this is the sort criteria
};

std::vector<MyType*> arr;

You may do this:

struct ArrayElement {
    MyType* m_pObj; // the actual object
    int m_SortCriteria; // should always equal m_pObj->m_SomeField
};

std::vector<ArrayElement> arr;

You may also remove the m_SomeField field from your struct, if you only access your object via this array.

This way, sorting the array doesn't dereference m_pObj on every comparison, so you make much better use of the cache.

Of course, you must keep m_SortCriteria synchronized with the object's m_SomeField (in case you edit it).
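Sorting such an array then touches only contiguous records; a sketch of what that looks like, reusing the names above (the sort function itself is illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct MyType { int m_SomeField; };

struct ArrayElement {
    MyType* m_pObj;      // the actual object
    int m_SortCriteria;  // cached copy of m_pObj->m_SomeField
};

// The comparator reads only the contiguous ArrayElement records and
// never dereferences m_pObj, so sorting stays cache-friendly.
void SortByCachedKey(std::vector<ArrayElement>& arr) {
    std::sort(arr.begin(), arr.end(),
              [](const ArrayElement& a, const ArrayElement& b) {
                  return a.m_SortCriteria < b.m_SortCriteria;
              });
}
```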

valdo
A: 

I would think that sorting on insertion would be better. We are talking O(log N) comparisons here, say ceil(log2(N)) + 1 retrievals of the data to sort with.

For N = 2000, that amounts to about 12 (ceil(log2(2000)) = 11, plus one).

What's great about this is that you can buffer the sort data of the element to be inserted, so those ~12 retrievals are the only ones you actually pay per insert.

You may wish to look at some inlining, but do profile before you're sure THIS is the tight spot.
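That comparison count is easy to sanity-check with a counting comparator (an illustrative sketch; `std::lower_bound` performs at most about ceil(log2(N)) + 1 comparisons):

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Count comparator calls for one binary search over n sorted ints.
int comparisons_for_search(int n, int target) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);  // 0, 1, ..., n-1
    int count = 0;
    std::lower_bound(v.begin(), v.end(), target,
                     [&count](int a, int b) { ++count; return a < b; });
    return count;
}
```

Running this with n = 2000 lands in the expected ~11-comparison range regardless of the target.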

Matthieu M.