The question is: In a k-way merge, how many merge operation will we perform. For example: 2-way merge:2 nodes 1 merge; 3 nodes 2 merge; 4 nodes 3 merge. So we get M(n)=n-1.
What the the M(n) when k is arbitrary?
The question is: In a k-way merge, how many merge operation will we perform. For example: 2-way merge:2 nodes 1 merge; 3 nodes 2 merge; 4 nodes 3 merge. So we get M(n)=n-1.
What the the M(n) when k is arbitrary?
2-way merges are most efficient when merging equal-sized blocks, so the most efficient k-way merge based on 2-way merges is to first merge block 1 with block 2, block 3 with block 4, and so on, then merge the first two resulting blocks, and so on. This is basically how mergesort works, and results in O(kn log k) time, assuming each of the k blocks contains n items. But it's only perfectly efficient if all blocks have exactly n items, and k is a power of 2, so...
Instead of performing k separate merge passes, you can use a single pass that uses a heap containing the first item of each block (i.e. k items in total):
If there are a total of kn items, this always takes O(kn log k) time regardless of how they are distributed amongst blocks, and regardless of whether k is a power of 2. Your heap needs to contain (item, block_index)
pairs so that you can identify which block each item comes from.
yes, The heap way may be more effective. But what's the answer to the orginal question? I found there may be no answer about that since it is maybe not a full k-way tree, so 4-way could regress to 3-way, 2-way.
OK, to answer the original question as stated:
To merge k blocks using a sequence of 2-way merges always requires exactly k - 1 merges, since regardless of what pair of blocks you choose to merge at any point in time, merging them reduces the total number of blocks by 1.
As I said in my original answer, which pairs of blocks you choose to merge does impact the overall time complexity -- it's better to merge similar-sized blocks -- but it doesn't affect the number of 2-way merge operations.
thanks hacker. In your original solution, it seems you use a heapsort to merge the blocks?
And I know if the each block doesn't contain the same number items, the merge is not effective. But Is there a formula to calculate that? just like 2-way which is (n-1)