Can anyone point me to a good reference on External Memory Mergesort? I've read the wiki page but am having trouble understanding it exactly. An animation might help but I can't seem to find one.
Basically, I know that you have a certain number of blocks on disk, and you can fit a certain number of blocks in memory. Lets say you have 32 blocks on disk and 4 blocks in memory. In the first pass you read 4 blocks into memory at a time, sort them in memory, and write them back out do disk. So at this point you have 8 sorted runs of 4 blocks. How does the merging work? Since I have 4 blocks in memory (assume I have one more for output) I think I should be able to merge 4 of those 8 runs at a time, and then merge the next 4 runs. And then in the last pass I want to merge the whole thing. But don't you have to read each block from disk each time? So how does this not become a n^2 solution?