views:

707

answers:

5

If one computer can only hold 1 million numbers, how to find out the median number from 100 million numbers?

+2  A: 

Do an external sort and then scan once for the median.

Hopefully, the real problem was "how do I do an external sort"? (If this is homework...I want to help in the right way. :-)

DigitalRoss
This is what I thought. :) But I'm not sure it's the correct answer, so I asked here.
Stephen Hsu
There has got to be a way to do this with the literal constraint that the device can only store 1 million numbers. Using the external sort seems like cheating. Now I'm gonna be up all night thinking about this.
JohnFx
Heh, I wondered that myself. It's a really good question.
DigitalRoss
+3  A: 

Reduce the problem to a more difficult one: sort the 100 million numbers using merge sort Then, take the 50 millionth element.

Pascal Cuoq
But the computer can hold 1 million numbers only, how can I find the 50 millionth one?
Stephen Hsu
On the tape (oh, right, this is no longer the 80s. I meant "on the disk"), at the 50 millionth position. You have storage for your 100M elements, right? Because if you don't (elements read from a stream) the exercice can be proved impossible by a counting argument.
Pascal Cuoq
+1  A: 

Using 101 computers and a sort-merge just like a database.

tster
Lol. This answer should feature as the best programmer joke!
Ashwin
I'll take it as one of my answers.:)
Stephen Hsu
A: 

Find the middle million numbers and then report the median of them. (Hmmm, now how to find those middle million numbers...)

Paul McGuire
A: 

I think I've got it! It isn't efficient (On^2), but logically it seems like it would work.

In pseudo-code:

for (locationCurrent=1;locationCurrent<100M,locationCurrent++)
{
   itemsToLeft=0;
   currentItem = readNumber(locationCurrent);

    for (locationCompare=1;locationCompare<100M,locationCompare++)
    {          
      if ((currentItem>=readNumber(LocationCompare)) &&  
          (locationCompare!=LocationCurrent))
      { itemsToLeft++;}

      /* go on to next current item if this one is past the median. */
      if (itemsToLeft>500000) break; 
    }

    if (itemsToLeft==500000) return currentItem
}
JohnFx