How big is the performance gap between std::sort and std::stable_sort in practice?

views:

601

answers:

+2 Q:

How big is the performance gap between std::sort and std::stable_sort in practice?

Both should run in O(n log n), but in general sort is faster than stable_sort. How big is the performance gap in practice? Do you have some experience about that?

I want to sort a very large number of structs that have a size of about 20 bytes. The stability of the result would be nice in my case, but it is not a must. At the moment the underlying container is a plain array, perhaps it could be changed to a std::deque later on.

+2 A:

As it is trivial to replace one algorithm with the other, why don't you try them both and time the results?

anon 2009-05-01 10:49:40

You are right, this would be very reasonable. But one run takes a long time in my environment and tests only one input instance. I was interested in experiences of other people. Perhaps there is someone who made some systematic experiments about that.

SebastianK 2009-05-01 10:59:51

+2 A:

Big enough to warrant a separate function that does stable sort and not have std::sort() do it transparently.

shoosh 2009-05-01 11:05:51

+4 A:

According to the STL documentation, stable_sort() performs between NlogN comparisons when sufficient memory is available. When insufficient memory is available, it degrades to N((logN)^2) comparisons. Therefore it is roughly of the same efficiency as sort() (which performs O(NlogN) comparisons in both average and worst case) when memory is available.

For those interested, sort() uses an introsort (quicksort which switches to heapsort when the recursion reaches a certain depth) and stable_sort() uses a merge sort.

marcog 2009-05-01 11:07:05

LogN^2 ususually does not mean log(N^2) but (log N)^2, and especially so if it occurs in big-O notation. This commonly happens with algorithms that take O(N log N) steps with O(log N) work per step and vice versa. So, no, it's not a constant 2.

MSalters 2009-05-01 13:35:54

You're right, changed my answer considerably as a result.

marcog 2009-05-01 14:58:51

How big is the performance gap in practice? Do you have some experience about that?

Yes, but it didn't go the way you would expect.

I took a C implementation of the Burrows-Wheeler Transform and C++-ified it. Turned out a lot slower than the C code (though the code was cleaner). So I put timing instrumentation in there and it appeared that the qsort was performing faster than the std::sort. This was running in VC6. It was then recompiled with stable_sort and the tests ran faster than the C version. Other optimisations managed to push the C++ version ~25% quicker than the C version. I think it was possible to eke more speed but the clarity of the code was disappearing.

graham.reeds 2009-05-01 11:36:03

Basically what I am trying to say is try both ways.

graham.reeds 2009-05-01 13:56:49

Sometimes std::stable_sort() is needed because it maintains order of elements that are equal.

And the conventional advice is that, if maintaining order is not important, you should use std::sort() instead.

However, its context dependant. There is plenty of data that is best sorted with stable sort even if you don't need to maintain order:

Quicksort quickly becomes worst-case performance if the data has consistently poor pivot points.

The Burrows-Wheeler Transform is an algorithm used as part of data compression, e.g. bzip2. It requires sorting all the rotations of the text. For the majority of text data, merge sort (as often used by std::stable_sort()) is substantially faster than quicksort (as usually used by std::sort()).

bbb is a BWT implementation that notes the advantages of std::stable_sort() over sort() for this application.

Will 2009-05-01 12:10:56

ansaurus

tags:

views:

answers:

How big is the performance gap between std::sort and std::stable_sort in practice?

related questions