I'm very curious: why is stability important (or not) in sorting algorithms?
Any ideas?
There are a few reasons why stability can be important. One is that if two records don't need to be swapped, swapping them anyway causes a needless memory update: a page is marked dirty and has to be re-written to disk (or another slow medium).
It depends on what you do.
Imagine you've got some person records with a first and a last name field. First you sort the list by first name. If you then sort the list by last name with a stable algorithm, you'll have a list sorted by last name AND first name.
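For example, here is a minimal sketch in Python, whose built-in list.sort() is guaranteed to be stable:

```python
# Python's built-in sort is stable, so a second sort by last name
# preserves the first-name order among people who share a last name.
people = [
    ("Carol", "Smith"),
    ("Alice", "Jones"),
    ("Bob",   "Smith"),
]

people.sort(key=lambda p: p[0])   # sort by first name
people.sort(key=lambda p: p[1])   # stable sort by last name

print(people)
# [('Alice', 'Jones'), ('Bob', 'Smith'), ('Carol', 'Smith')]
```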
Background: a "stable" sorting algorithm keeps the items with the same sorting key in order. Suppose we have a list of 5-letter words:
peach straw apple spork
Stable-sorting by the first letter gives us:
apple peach straw spork
In an unstable algorithm, straw and spork may be interchanged, but in a stable sort they stay in the same relative positions (that is, since 'straw' appears before 'spork' in the input, it also appears before 'spork' in the output).
We could sort the list of words using this algorithm: stable sorting by column 5, then 4, then 3, then 2, then 1. In the end, it will be correctly sorted. Convince yourself of that. (by the way, that algorithm is called radix sort)
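A minimal sketch of that idea in Python, using the built-in stable sort for each column pass:

```python
# Radix-style sort of equal-length words: stable-sort by the last
# column first, then by each earlier column; the final pass on
# column 1 leaves the words fully sorted.
words = ["peach", "straw", "apple", "spork"]

for col in reversed(range(5)):        # columns 5, 4, 3, 2, 1 (0-indexed 4..0)
    words.sort(key=lambda w: w[col])  # Python's sort is stable

print(words)  # ['apple', 'peach', 'spork', 'straw']
```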
Now to answer your question, suppose we have a list of first and last names. We are asked to sort "by last name, then by first". We could first stable-sort by the first name, then stable-sort by the last name. After these sorts, the list is primarily sorted by the last name. However, where last names are the same, the records are ordered by first name.
You can't stack unstable sorts in the same fashion.
Sorting stability means that records with the same key retain their relative order before and after the sort.
So stability matters if, and only if, the problem you're solving requires retention of that relative order.
If you don't need stability, you can use a fast, memory-sipping algorithm from a library, like heapsort or quicksort, and forget about it.
If you need stability, it's more complicated. Stable algorithms generally cost more in CPU and/or memory than unstable ones (for example, a stable merge sort needs O(n) extra space, while heapsort sorts in place). So when you have a large data set, you have to pick between beating up the CPU or the memory. If you're constrained on both CPU and memory, you have a problem. A good compromise stable algorithm is a binary tree sort; the Wikipedia article has a pathetically easy C++ implementation based on the STL.
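This is not the Wikipedia/STL implementation that answer refers to, just a minimal Python sketch of an (unbalanced) binary tree sort, kept stable by sending equal keys to the right subtree:

```python
# Binary tree sort sketch. Equal keys go to the RIGHT subtree, so an
# in-order traversal returns them in insertion order -- i.e. the sort is stable.
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value, key):
    if root is None:
        return Node(value)
    if key(value) < key(root.value):
        root.left = insert(root.left, value, key)
    else:  # equal or greater keys go right, preserving insertion order
        root.right = insert(root.right, value, key)
    return root

def in_order(root, out):
    if root is not None:
        in_order(root.left, out)
        out.append(root.value)
        in_order(root.right, out)

def tree_sort(items, key=lambda x: x):
    root = None
    for item in items:
        root = insert(root, item, key)
    out = []
    in_order(root, out)
    return out

pairs = [("b", 1), ("a", 2), ("b", 3)]
print(tree_sort(pairs, key=lambda p: p[0]))
# [('a', 2), ('b', 1), ('b', 3)]  -- the two 'b' records keep their order
```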
You can make an unstable algorithm into a stable one by adding the original record number as the last-place key for each record.
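A sketch of that trick in Python, using heapq (a heapsort, which is not stable on its own) with the original position appended as the tie-breaking key:

```python
import heapq

def stable_heapsort(records, key):
    # Decorate each record with its original index so that equal keys
    # compare by position; the records themselves are never compared.
    decorated = [(key(r), i, r) for i, r in enumerate(records)]
    heapq.heapify(decorated)
    return [heapq.heappop(decorated)[2] for _ in range(len(decorated))]

data = [("b", "first"), ("a", "x"), ("b", "second")]
print(stable_heapsort(data, key=lambda r: r[0]))
# [('a', 'x'), ('b', 'first'), ('b', 'second')]
```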