views:

342

answers:

7

Hi,

I need to optimize some code that sorts a vector<pair<int, float >>a where the pairs needs to be sorted on the float value. The vector will have an length between 0 and 5. I've been googling and reading up on sorting methods in C++ but cannot find any benchmarks on sorting tiny data sets. For the system it's important be as fast as possible as it's used for a real time blob tracking system.

Kind regards, Pollux

+3  A: 

For the data you have mentioned (0-5), use STL sort methods. ( historically qsort based)

stable_sort -- if you want to maintain the order for duplicates. ( historically merge sort based)

aJ
+3  A: 

Any particular reason why you need this code optimized? For n==5, pretty much any sort will do. Even Bogosort should have a reasonable runtime, since there are only 5! == 120 possible permutations in your data. Have you profiled your algorithm to find out whether this is a bottleneck?

Chinmay Kanchi
+6  A: 

Insertion sort and Bubble sort are great for tiny data pairs.

Another option is to hard code the compare logic, with a couple if statements.
Check out What is the fastest possible way to sort an array of 7 integers? for some ideas.

Nick D
Doesn't Insertion sort always beat Bubble sort?
Nikko
@Nikko, yeah insertion would be a better choice in general. However, if you implement bubble sort inner loop *properly* (optimized), I don't think it'll be any performance difference on such small data sets.
Nick D
+6  A: 

It makes no sense to read about benchmarks. You can read about and compare the complexity (Big-O) of algorithms, because it only depends on the algorithms themselves, but benchmarking is something that depends on too many factors.

You need to do the benchmarking for yourself, using the tools that you use, in the environment that matters to the users of your application.

sbi
+1  A: 

If you're really certain (that is, you have measured) that this is a bottleneck and needs to be optimized, and just any sort algorithm from STL won't be fast enough (and you've measured that too), then perhaps you know something more that you can use? Standard sorting algorithms are general, and will work (reasonably well) for any data set: different numbers of elements, different ranges of values, and so on. If you really need performance, the trick is often to use additional information to make a more specialized algorithm.

Here you have said one thing: there are 0-5 elements to sort, As Nick D said in his answer, perhaps this will make it faster to use hard-coded if statements instead of the typical loops or recursion of general sorting algorithms.

But perhaps there is more? For example, are there any restrictions on the float values that can occur?

Thomas Padron-McCarthy
Hi everyone! Thanks a lot for this usefull info!
pollux
+2  A: 

First, premature optimization is the root of all evil. That is, first benchmark your code and make sure the sorting is the one that's actually taking the most time. If another part of your performance-critical code is taking 80% of the execution time, you will get drastic performance improvements optimizing that first.

Considering you have 5 elements, the point is pretty much moot. Any algorithm you use (except bogosort :D) should have a pretty much constant execution time, unless you run the algorithm a few hundred times per second, or more.

Second, provided you still want to optimize the search, if you know for sure you always will have 5 elements, the optimal method is by hard-coding the comparations (It can be proven mathematically that this method is optimal, providing a minimal number of executed operations in the worst case scenario - we had this as a laboratory application in university). The same applies if you have less than five elements, but the algorithm becomes prohibitive to write and execute if you have more than seven elements - the logic is convoluted to write and follow so your code will become difficult to maintain.

Let me know if you need help with writing the optimal hard-coded algorithm itself (though I think you should find the implementation and theory online).

If you do not always have five numbers (or for some reason want to avoid the hardcoded comparisions algorithm), use the sort provided by the library. This page concludes that the stl sort is optimal in terms of time (not only number of executed operations). I do not know what implementation was used for search.

utnapistim
Small complaint: "constant execution time" doesn't mean fast, it just means constant, so it's not necessarily a good thing if it is "pretty much constant", and it won't become less constant if you repeat it a few hundred times.
Thomas Padron-McCarthy
I wouldn't trust that page that shows STL sort is optimal. For one thing, I couldn't find the source for all of the sorts used. There is a file `d_sort.h` missing that I guess comes with that book there. Second, I have a single-pivot, recursive quicksort that is significantly faster than std::sort. Plus, there is a dual-pivot quicksort (not of my own creation) that beats my quicksort and std::sort. That said, std::sort is fairly fast, but it's good to be wary of studies and scrutinize them to see how they were done.
Justin Peel
+2  A: 

Use a somewhat nasty set of ifs for the fastest sort of 5 items, or if you want a sort that is just a little bit slower, but much less of a headache, you can use a sorting network. Use this site with the number of inputs equal to five to get an optimized sorting network. It even shows how you can do some parts in parallel though that seems a bit excessive for only 5 inputs. It will require 9 comparisons which is quite good. You will implement the sorting network with a set of ifs as well, but the difference between the somewhat nasty set of ifs, which I mentioned at the beginning, which truly optimal and an optimal sorting network is the number of swaps: the sorting network doesn't minimize the number of swaps.

Justin Peel