If you have 5 distinct numbers, how many comparisons at most do you need to sort this using merge sort?
According to Wikipedia: In the worst case, merge sort does an amount of comparisons equal to or slightly smaller than (n ⌈lg n⌉ - 2^⌈lg n⌉ + 1)
When merge-sorting two lists of length L1 and L2, I suppose the worst case number of comparisons is L1+L2-1.
- Initially you have five 1-long lists.
- You can merge two pairs of lists with 2 comparisons, resulting in lists of length 2,2 and 1.
- Then you can merge a 2 and 1 long list with at most another 1+2-1 = 2 comparisons, yielding a 2 and 3 long list.
- Finally you merge these lists with at most 2+3-1 = 4 comparisons.
So I guess the answer is 8.
This sequence of numbers results in the above: [2], [4], [1], [3], [5] -> [2,4], [1,3], [5] -> [2,4], [1,3,5] -> [1,2,3,4,5]
Edit:
Here is a naive Erlang implementation. Based on this, the number of comparisons is 5,6,7 or 8 for permutations of 1..5.
-module(mergesort).
-compile(export_all).
test() ->
lists:sort([{sort(L),L} || L <- permutations()]).
sort([]) -> {0, []};
sort([_] = L) -> {0, L};
sort(L) ->
{L1, L2} = lists:split(length(L) div 2, L),
{C1, SL1} = sort(L1), {C2, SL2} = sort(L2),
{C3, RL} = merge(SL1, SL2, [], 0),
{C1+C2+C3, RL}.
merge([], L2, Merged, Comps) -> {Comps, Merged ++ L2};
merge(L1, [], Merged, Comps) -> {Comps, Merged ++ L1};
merge([H1|T1], [H2|_] = L2, Merged, Comps) when H1 < H2 -> merge(T1, L2, Merged ++[H1], Comps + 1);
merge(L1, [H2|T2], Merged, Comps) -> merge(L1, T2, Merged ++[H2], Comps + 1).
permutations() ->
L = lists:seq(1,5),
[[A,B,C,D,E] || A <- L, B <- L, C <- L, D <- L, E <- L, A =/= B, A =/= C, A =/= D, A =/= E, B =/= C, B =/= D, B =/= E, C =/= D, C =/= E, D =/= E].
What is stopping you from coding a merge sort, keeping a counter for the number of comparisons in it, and trying it out on all permutations of [0,1,2,3,4]?
I find the question interesting, so I decided to explore it thoroughly (with a little experimentation in Python).
I downloaded mergesort.py
from here and modified it to add a cmp
argument for a comparator function. Then:
import collections
import itertools
import mergesort
import sys
class CountingComparator(object):
def __init__(self):
self.count = 0
def __call__(self, a, b):
self.count += 1
return cmp(a, b)
ms_histo = collections.defaultdict(int)
for perm in itertools.permutations(range(int(sys.argv[1]))):
cc = CountingComparator()
lperm = list(perm)
mergesort.mergesort(lperm, cmp=cc)
ms_histo[cc.count] += 1
for c in sorted(ms_histo):
print "%d %2d" % (c, ms_histo[c])
The resulting simple histogram (starting with a length of 4, as I did for developing and debugging this) is:
4 8
5 16
For the problem as posted, with a length of 5 instead of 4, I get:
5 4
6 20
7 48
8 48
and with a length of 6 (and a wider format;-):
7 8
8 56
9 176
10 288
11 192
Finally, with a length of 7 (and even wider format;-):
9 16
10 128
11 480
12 1216
13 1920
14 1280
Surely some perfectly regular combinatorial formula lurks here, but I'm finding it difficult to gauge what it might be, either analytically or by poring over the numbers. Anybody's got suggestions?