views:

806

answers:

5

If you have 5 distinct numbers, how many comparisons at most do you need to sort this using merge sort?

+1  A: 

According to Wikipedia: In the worst case, merge sort does an amount of comparisons equal to or slightly smaller than (n ⌈lg n⌉ - 2^⌈lg n⌉ + 1)

norheim.se
I read that, i just wanted to see number.so then can i say : 5*3 - 2*3 +1 = 10. I was also curious about such instance of the numbers.
As ⌈lg 5⌉ is 2, the answer is 5*2-2^2+1 = 7. This makes sense if you follow the algorithm as described in the article. If the initial sequence is 2,4,1,3,5, the comparisons will be, in order of appearance: (2,4) (2,1) (3,5) (1,3) (2,3) (4,3) (4,5)
norheim.se
How about 2,4,5,3,1 ?
Zed
BTW, lg(5) = 2.322 ...
Zed
the formula suggests the ceiling, not the floor. hence log5 = 3
Actually, the mistake I made was assuming that lg referred to the natural logarithm. ln(5)=1.609... But of course the base 2 logarithm makes more sense in this context.
norheim.se
+2  A: 

When merge-sorting two lists of length L1 and L2, I suppose the worst case number of comparisons is L1+L2-1.

  • Initially you have five 1-long lists.
  • You can merge two pairs of lists with 2 comparisons, resulting in lists of length 2,2 and 1.
  • Then you can merge a 2 and 1 long list with at most another 1+2-1 = 2 comparisons, yielding a 2 and 3 long list.
  • Finally you merge these lists with at most 2+3-1 = 4 comparisons.

So I guess the answer is 8.

This sequence of numbers results in the above: [2], [4], [1], [3], [5] -> [2,4], [1,3], [5] -> [2,4], [1,3,5] -> [1,2,3,4,5]

Edit:

Here is a naive Erlang implementation. Based on this, the number of comparisons is 5,6,7 or 8 for permutations of 1..5.

-module(mergesort).

-compile(export_all).


test() ->
  lists:sort([{sort(L),L} || L <- permutations()]).

sort([]) -> {0, []};
sort([_] = L) -> {0, L};
sort(L) -> 
  {L1, L2} = lists:split(length(L) div 2, L),
  {C1, SL1} = sort(L1), {C2, SL2} = sort(L2),
  {C3, RL} = merge(SL1, SL2, [], 0),
  {C1+C2+C3, RL}.

merge([], L2, Merged, Comps) -> {Comps, Merged ++ L2};
merge(L1, [], Merged, Comps) -> {Comps, Merged ++ L1};
merge([H1|T1], [H2|_] = L2, Merged, Comps) when H1 < H2 -> merge(T1, L2, Merged ++[H1], Comps + 1);
merge(L1, [H2|T2], Merged, Comps) -> merge(L1, T2, Merged ++[H2], Comps + 1).


permutations() ->
  L = lists:seq(1,5),
  [[A,B,C,D,E] || A <- L, B <- L, C <- L, D <- L, E <- L, A =/= B, A =/= C, A =/= D, A =/= E, B =/= C, B =/= D, B =/= E, C =/= D, C =/= E, D =/= E].
Zed
+3  A: 

What is stopping you from coding a merge sort, keeping a counter for the number of comparisons in it, and trying it out on all permutations of [0,1,2,3,4]?

MAK
I like your answer, i just dont have much time to code it.I looked at these sorting applets and some of them are just wrong, some of them are just pictures.
Merge sort doesn't really take as long to code as you are probably thinking. It is fairly short in Python (and I think even shorter in many functional languages), and a basic C/C++/Java solution shouldn't also be too long.
MAK
+1  A: 

http://www.sorting-algorithms.com/

Makach
Thanks.I ve seen that.
+3  A: 

I find the question interesting, so I decided to explore it thoroughly (with a little experimentation in Python).

I downloaded mergesort.py from here and modified it to add a cmp argument for a comparator function. Then:

import collections
import itertools
import mergesort
import sys

class CountingComparator(object):
  def __init__(self):
    self.count = 0
  def __call__(self, a, b):
    self.count += 1
    return cmp(a, b)

ms_histo = collections.defaultdict(int)

for perm in itertools.permutations(range(int(sys.argv[1]))):
  cc = CountingComparator()
  lperm = list(perm)
  mergesort.mergesort(lperm, cmp=cc)
  ms_histo[cc.count] += 1

for c in sorted(ms_histo):
  print "%d %2d" % (c, ms_histo[c])

The resulting simple histogram (starting with a length of 4, as I did for developing and debugging this) is:

4  8
5 16

For the problem as posted, with a length of 5 instead of 4, I get:

5  4
6 20
7 48
8 48

and with a length of 6 (and a wider format;-):

7    8
8   56
9  176
10 288
11 192

Finally, with a length of 7 (and even wider format;-):

 9   16
10  128
11  480
12 1216
13 1920
14 1280

Surely some perfectly regular combinatorial formula lurks here, but I'm finding it difficult to gauge what it might be, either analytically or by poring over the numbers. Anybody's got suggestions?

Alex Martelli
nice work. I really appreciate your curiosity and interest in the topic. By looking at the results, you can see that the number of comparisons are more than n and less than 2n. Wiki Suggests : In the worst case, merge sort does an amount of comparisons equal to or slightly smaller than (n ⌈lg n⌉ - 2⌈lg n⌉ + 1), which is between (n lg n - n + 1) and (n lg n + n + O(lg n)). [1]