If you have n (n > 2) lists, it is not true that you need to carry out (n-1) comparisons on every merging step. However, the implementation becomes more complex.
Suppose you have three lists, list[0..2]
, and assume for simplicity you are merging them and they are all still non-empty (i.e. you have not run to the end of any of the lists). Assume furthermore for simplicity that all elements are distinct, i.e. when you compare two elements they are never the same. You have then six possible "states" in which you can be, which correspond to the six permutations of the three lists in an increasing order of the first elements on the lists, i.e. if
list[0] = [5, 7, 11, 15]
list[1] = [3, 4, 20, 21]
list[2] = [9, 10, 12, 19]
then the corresponding permutation of the lists is [1, 0, 2], i.e. list[1]
has the least front element and list[2]
has the greatest front element.
When you now pop the next element (4) from list[1]
you know already that list[0].front
< list[2].front
based on the state [1, 0, 2] where you were. So now you need to perform either 1 or 2 comparisons:
if (list[1].front < list[0].front) // (A)
--> move to state [1, 0, 2], next to pop is list[1]
else if (list[1].front < list[2].front)
--> move to state[0, 1, 2], next to pop is list[0]
else
--> move state[0, 2, 1], next to pop is list[0]
Assuming some kind of uniformity, the probability that the comparison (A) returns true, i.e. that the next element on the list from which you removed the previous element is less than the least element on the other two lists, is 1/3, so you have on the average (1/3 x 1 + 2/3 x 2) = 5/3 comparisons instead of 2 (which would be n-1).
This is obviously worse by 2/3 comparisons per insert the normal mergesort which only needs 1 comparison per popped element.
We can get a better result by considering also partially ordered states. There are three distinct comparisons which can be made (list[0] -- list[1], list[1] -- list[2] and list[0] -- list[2]). If we allow for the known results (<, >) to be augmented with "don't know" (?), there are the following possible states:
0/1 1/2 0/2
< < < (total order) [0,1,2]
< ? < (0 is least but don't know if 1 < 2) [0,1,2] [0,2,1]
< ? ? (0 is < 1, but 2 can't be positioned) [2,0,1] [0,2,1] [0,1,2]
? ? ? (no information about ordering) (all six permutations)
and then all the variants regarding permutations and swapping < for > at different places in the matrix.
Now if were in a (<,<,<) state (assume [0,1,2]) and you read the next element from the list from which you popped the previous one, there are two cases: either (1) you get an element that is lower than the element on list[1], in which case you are back at state [0,1,2] in one comparison; or you get an element that is higher than the element on list[1]. In this case you can output list[1] next, and you have entered a (<,?,<) state: you know that list[1] has the least front element but don't know now if list[0] or list[2] is the next.
Now in a (<,?,<) state you read a new element from list[1], you can use 1 + (1/3 + 4/3) = 1 5/3 comparisons to find the actual permutation of all the lists and you get back to a (<,<,<) state. Thus this sequence of two pushes cost 2 5/3 comparisons, for an average of 1 5/6 = 11/6 per insert; however because of the possibility that you can insert two lowest elements in a sequence from the same list the average cost is even less, by the same argument as before it is (1/3 + 2/3 x 11/6) = 6/18 + 22/18 = 1 + 5/9, worse than the original mergesort by 5/9 comparisons per insert but slightly better than 2/3 above.
For completeness, here the algorithm (fragment shown):
state_1_lt_2: /* known list[1].front < list[2].front */
if (list[0].front < list[1].front):
merge_from(0)
goto state_1_lt_2 /* 1 insert 1 comp prob 1/3 */
else
merge_from(1)
if (list[0].front < list[1].front)
if (list[1].front < list[2].front)
merge_from(0)
goto state_1_lt_2 /* 2 inserts 3 comps prob 2/3*1/2*1/3 = 1/9 */
else if (list[0].front < list[2].front)
merge_from(0)
goto state_2_lt_1 /* 2 inserts 4 comps prob 2/3*1/2*2/3*1/2 = 1/9 */
else
merge_from(2)
goto state_0_lt_1 /* 2 inserts 4 comps prob 1/9 */
else if (list[2].front < list[1].front)
merge_from(2)
goto state_1_lt_0 /* 2 inserts 3 comps 2/3 x 1/2 x 1/3 = 1/9 */
else if (list[2].front < list[0].front)
merge_from(1)
goto state_2_lt_0 /* 2 inserts 4 comps prob 1/9 */
else
merge_from(1)
goto state_0_lt_2 /* 2 inserts 4 comps prob 1/9 */
This sums up to an expected number of comparisons per insert of
1/3 x 1 + 4/9 x (4/2) + 2/9 x (3/2) = 6/18 + 16/18 + 6/18 = 30/18 = 1 5/9.