views:

149

answers:

2

Bubble sort is O(n) at best, O(n^2) at worst, and its memory usage is O(1) . Merge sort is always O(n log n), but its memory usage is O(n).

Which algorithm we would use to implement a function that takes an array of integers and returns the max integer in the collection, assuming that the length of the array is less than 1000. What if the array length is greater than 1000?

+10  A: 

A sequential scan of a dataset, looking for a maximum, can be done in O(n) time with O(1) memory usage.

You just set the current maximum to the first element then run through all the other elements, setting the current maximum to the value if the value is greater. Pseudo-code:

max = list[first_index]
for index = first_index+1 to last_index:
    if list[index] > max:
        max = list[index]

The complexity doesn't change regardless of the number of elements in the list so it doesn't matter how many there are.


The running time will change however (since the algorithm is O(n) time) and, if it's important to find the maximum fast, there are a number of possibilities. These all hinge on doing work when the list changes, not every time you want the information, hence they're better for a list that is read more often than written, so the cost can be amortised.

Option 1 is to keep the list sorted so you can just grab the last element. This is probably overkill for just keeping a record of the maximum.

Option 2 is to recalculate the maximum (and number of elements holding it) when you insert into, or delete from, the list. Initially set max to 0 and maxcount to 0 for an empty list.

For an insert:

  • if maxcount is 0 (the list is empty), set max to this value and maxcount to 1.
  • otherwise, if this value is greater than max, set max to this value and maxcount to 1.
  • otherwise, if this value is equal to max, add 1 to maxcount.

For a deletion:

  • if this value is equal to max, subtract 1 from maxcount.
  • then, if maxcount is 0, rescan list to get new max and maxcount.

This way, at any time, you have the maximum value (the count is simply an extra "trick" to speed up the algorithm where there is more than one element holding the maximum value). I've used this once before in a data analysis application and it turned out to be much faster than re-sorting - I had to store both minimum and maximum in that case, but it's the same idea.

paxdiablo
The simple, "programming 101" way for keeping track of an extremum of a collection is a heap.
Svante
+1  A: 

Maximum value is always O(n) unless it's presorted. If presorted, it's less. Sorting before search is always worse than O(n)... so, generally, 1,000 elements will take 1,000 comparisons... just compare literally. If working with sorted structures, it's inexpensive. If not, it's expensive. Insertion with sorted structures are more expensive. ... this is why it's such a problem issue.

Pestilence