views:

106

answers:

4

Suppose I have a list of things (numbers, to keep things simple here) and I have a function I want to use to sort them by, using SortBy. For example, the following sorts a list of numbers by last digit:

SortBy[{301, 201}, Mod[#,10]&]

And notice how two of (ie, all of) those numbers have the same last digit. So it doesn't matter which order we return them in. In this case Mathematica returns them in the opposite order. How can I ensure that all ties are broken in favor of how the items were ordered in the original list?

(I know it's kind of trivial but I feel like this comes up from time to time so I thought it would be handy to get it on StackOverflow. I'll post whatever I come up with as an answer if no one beats me to it.)

Attempts at making this more searchable: sort with minimal disturbance, sort with least number of swaps, custom tie-breaking, sorting with costly swapping, stable sorting.

PS: Thanks to Nicholas for pointing out that this is called stable sorting. It was on the tip of my tongue! Here's another link: http://planetmath.org/encyclopedia/StableSortingAlgorithm.html

+1  A: 

This seems to work:

stableSortBy[list_, f_] := 
  SortBy[MapIndexed[List, list], {f@First[#], Last[#]}&][[All,1]]

But now I see rosettacode gives a much nicer way to do it:

stableSortBy[list_, f_] := list[[Ordering[f /@ list]]]

So Ordering is the key! It seems the Mathematica documentation makes no mention of this sometimes-important difference Sort and Ordering.

dreeves
This approach is called the "decorate-sort-undecorate" idiom in Lisp circles, and even if `GatherBy` may be the best approach to stable sorting in particular, it is a phenomenally useful trick in Mathematica.
Pillsy
I'd be curious as to how this stacks up against `GatherBy` in terms of speed, and it would depend on how lists are implemented internally. My issue is that `GatherBy` might only touch each list element once, while the `Ordering` solution would require at least twice: once for the sort and once for the re-ordering of the elements. For small lists, it may not matter. But, without actually doing it, I suspect that for longer lists `GatherBy` will give superior performance.
rcollyer
+2  A: 

Does GatherBy do what you want?

Flatten[GatherBy[{301, 201, 502, 501, 101}, Mod[#, 10] &]]
Mark Fisher
Oh, it probably does, thanks! I hadn't thought of using GatherBy. I'm inclined to go with the Ordering solution. If you're curious and want to compare the solutions so far with Timing tests I'll make yours the accepted answer. (Oh, one problem with your solution: you should do Flatten[..., 1] otherwise if the elements were actually lists then the Flatten would ruin them.)
dreeves
A: 

There is a variant of SortBy which breaks ties by using additional ordering functions:

SortBy[list,{f1, f2, ...}]

By counting ties you can thus obtain a stable sorting:

Module[{tie = 0}, 
 SortBy[{19, 301, 201, 502, 501, 101, 300}, {Mod[#, 10] &, (tie++) &}]]

yields

{300, 301, 201, 501, 101, 502, 19}
sakra
+2  A: 

After asking around, I was given a satisfying explanation:

Short answer: You want SortBy[list, {f}] to get a stable sort.

Long answer:

SortBy[list, f] sorts list in the order determined by applying f to each element of list, breaking ties using the canonical ordering explained under Sort. (This is the second documented "More Information" note in the documentation for SortBy.)

SortBy[list, {f, g}] breaks ties using the order determined by applying g to each element.

Note that SortBy[list, f] is the same as SortBy[list, {f, Identity}].

SortBy[list, {f}] does no tie breaking (and gives a stable sort), which is what you want:

In[13]:= SortBy[{19, 301, 201, 502, 501, 101, 300}, {Mod[#, 10] &}]

Out[13]= {300, 301, 201, 501, 101, 502, 19}

Finally, sakra's solution SortBy[list, {f, tie++ &}] is effectively equivalent to SortBy[list, {f}].

Andrew Moylan
Wow, thank you, Andrew! I read this and was like "you just asked around?? where do you work, Wolfram Research?" Then I clicked your name and saw that indeed you do. :) I'm continually amazed at the levels of expertise StackOverflow attracts. Thanks so much for being here!
dreeves