views:

436

answers:

6

I'm trying to figure out an algorithm to find the highest 2 numbers in a list of numbers.

The highest number can be found in n-1 stages, perhaps by doing the fist step of a bubble sort or something along those lines. To me it seems that finding the next highest number as well could be found in a total of 1.5n comparisons on average.

My professor set us homework to write an alogrithm that finds the highest 2 numbers in n + log(n) comparisons. Is this even possible? Any ideas, suggestions?

Edit: When I say n + log(n) I'm not referring to O(n + log n), but rather exactly n + log n

+1  A: 

How about this:

for each listOfNumbers as number
    if number > secondHighest
        if number > highest
            secondHighest = highest
            highest = number
        else
            secondHighest = number
nickf
this requires `O(2n)` comparisons, which is greater than `O(n + log(n))`.
Dominic Rodger
The above refers to the worst-case running time, if the numbers are stored in ascending order.
Dominic Rodger
O(2n) is the same as O(n) so that comment doesn't make sense.
Kinopiko
yep. the loop is single pass, and it'll be somewhere in between (1 + 2/n) comparisons and 2n, depending on the initial order. any better ideas?
nickf
@Kinopiko, sorry, being braindead. Ignore the `O` and hopefully my comment does make sense. (i.e. `2n` comparisons is more than `n + log(n)` comparisons)
Dominic Rodger
+1  A: 

Edit: Oops, missed a simple thing due to stupidity. This solution isn't correct, although I am keeping it here as it is still avg(n+log(n)). Thanks to ShreevatsaR for pointing out my stupidity. I did consider the tree search, but completly missed the idea of backtracking to find the second highest number in log(n).

Anyway, here follows my proof for why the inferior algorithm is no more than avg(n+log(n)). In real life it should still perform pretty well at least.

  • First compare against the second highest recorded number.
  • Only if that comparison succeeds, compare against the highest recorded number.

To prove that it is on average n+log n, we simply have to prove that the first comparison only succeeds log(n) times on average. And that is fairly simple to see or demonstrate.

  1. Assume P as the the actual position of the current second largest number in a sorted version of the list, and run the algorithm
  2. If P>2 then when a larger number is found, the new P will on average be no more than P/2.
  3. If P=2 then the first comparison can not succeed, as there is no number that is greater than the current second largest number.
  4. P can at most equal N
  5. From 2, 3 and 4 it should be trivial to see that the first comparison can not succeed more than log(N) times on average.
Marcus Andrén
Your first lines aren't correct: you don't need 2*n comparisons in the worst case, and in fact it can be done in exactly (n-1)+ceiling(log n)-1, comparisons, as the question asks. I'm not downvoting this because the rest of the answer is right, but really, "I can't think of a way" is not a proof. :P
ShreevatsaR
+3  A: 

Yes, it's possible to do it in no more than (n + log n). I really can't tell you how without giving away the answer, but let me try. :-)

Take the n numbers, compare them in pairs at a time. Take the ceil(n/2) "winners", and repeat, "like a binary tree". Questions: how many comparisons does it take to find the largest one? How many people does this "winner" win against? To whom might the second largest have lost? So how many comparisons does it now take to find the second largest number?

The answer turns out to be a total of n-1 + ceiling(log n) - 1 comparisons, where the log is to base 2. You can also prove using an adversarial argument that it is not possible to do better than this in the worst case.

ShreevatsaR
Thanks. So simple, yet clever. Now I'm going to try and program it recursively and efficiently in Java...
Dave
A: 

The answer posted by ShreevatsaR seems to be O(n log n).

The first pass (n operations) produces n/2 answers. By repeating, I guess you mean you'll do n/2 operations to produce n/4 answers. You'll go through the loop log n times. This is a lot like a merge sort except that merge sort always processes n nodes each time through. It also runs the loop log n times. And I don't see how this algorithm will keep track of the second higest number.

nickf has the right answer. worst case is when the list is sorted, it will do 2n comparisons - that is O(n).

btw, O(n + log n) is O(n) the Order notation refers to worst case asymptotic growth.

Eh? n/2 + n/4 + … = n-1. (Another way to see this: everyone except the winner loses exactly once.) The algorithm I gave takes exactly n+log(n)-2 comparisons (not just asymptotically). Note that the measure in question is the number of comparisons, not running time. (But n+log(n) is O(n), so running time is O(n) too.) I'm sorry my answer isn't clearer, but I don't want to completely give away the answer to a homework question.
ShreevatsaR
A: 

You can use counting sort, radix sort, bucket sort or other linear time algorithm to first sort the list in descending order. Then just get the first 2 elements of the sorted list. So this will take (n) + 2 = (n)

Note that this algorithms can sort in linear time because each one has its own assumptions.

Enrique
(1) Nothing is guaranteed about the given numbers, so linear-time sorting is not possible. (2) Notice that the measure here is the number of comparisons, not the number of operations (3) Notice that it's asking for *exactly* n+log(n), not O(n + log(n)) — which would be just O(n).
ShreevatsaR
A: 

Pseudocode (isn't this essentially n?)

int highestNum = 0
int secondHighest = highestNum

for(i = 0; i < list.length; i++)
{
if(list[i] >= highestNum)
{
secondHighest = highestNum
highestNum = list[i]
}
}
Cobalt
Test on this list: 0 3 2 1. The secondHighest will remain 0. (More generally, take any list where the highest number occurs before the second highest number, and your code will miss the second highest.)
ShreevatsaR