If I have an algorithm which is composed of (let's say) three sub-algorithms, all with different O() characteristics, e.g.:

  • algorithm A: O(n)
  • algorithm B: O(log(n))
  • algorithm C: O(n log(n))

How do I theoretically estimate the O() for the entire algorithm, i.e. without clocking it or performing any other practical measurement?

Is there a well known formula or procedure?

+1  A: 

Would it make sense to take the worst performing section as the overall performance?

Rich Bradshaw
+3  A: 

The "total" complexity is the worst case of all sub algorithms. In your example: O(n*log(n))

The definition of O(f(n)) is that, starting from some number (some n0), there is a constant Const such that the total number of actions on an input of size n is less than Const*f(n).

Therefore, if you have a group of sub-algorithms, the total complexity will always be less than the sum of all the constants (Const1 + Const2 + ...) multiplied by the worst complexity function (e.g., from "n", "n*logn" and "n^2" it'll be n^2). By the definition of complexity, that is the worst complexity in the group.
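Spelled out in the same notation, the step being used is the pointwise inequality (valid once n is past all of the individual n0 values):

    Const1*f1(n) + Const2*f2(n) + ... <= (Const1 + Const2 + ...) * max(f1(n), f2(n), ...)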

Example:

  1. Let's assume we're sorting an array (n*logn)
  2. Then finding the item with a key higher than x (logn)

Assume Const1 of the sort is 5 (meaning we can sort a list of n items in less than 5*n*logn actions) and Const2 is 3 (meaning we can find the element in less than 3*logn actions).

In this case, it's clear that the total number of actions of both algorithms is less than (5+3)*n*logn, and therefore it is O(n*logn).
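To make that concrete, a minimal Python sketch of the two steps (the function name and the use of the standard bisect module are illustrative choices):

    import bisect

    def sort_then_find(items, x):
        # Step 1: sort -- O(n log n) comparisons.
        items = sorted(items)
        # Step 2: binary search for the first key higher than x -- O(log n).
        i = bisect.bisect_right(items, x)
        return items[i] if i < len(items) else None

    # The steps run one after the other, so the costs add:
    # O(n log n) + O(log n) = O(n log n); the slower bound dominates.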

Elisha
It would depend on whether the algorithms are run the same number of times relative to each other, or if there's a multiplicative effect, in which case you have to multiply them together.
Joe
@Joe, you're right, this analysis applies to sequential runs and not to nested runs, I hope that's the question :) The result remains the same if some of the algorithms run more than once, as long as the number of times is constant and does not depend on n.
Elisha
+8  A: 

Is there a well known formula or procedure?

Since that’s basic maths, yes. But it’s even simpler in your case: since all of these give an upper bound, you can simply take the maximum of all the bounds to get the total upper bound.

– Provided that all of these algorithms are chained (rather than nested). If the algorithms are nested, you need to multiply their upper bounds (in the simplest case).

To illustrate, let’s say that you have a container data structure which has a cost of O(log n) for lookup of a single element. You also have an algorithm that needs such a lookup, but has running time O(n) itself, assuming constant costs for lookup: it uses a single loop over the input, with a constant number of lookups per iteration.

Now, if you want to use the previously mentioned container data structure with this algorithm, its lookup runtime obviously replaces the (assumed) constant-time lookup. So we’ve got the same loop, but each of its iterations now takes O(log n) instead of constant-time O(1), so the overall runtime becomes O(n log n).
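A rough Python sketch of that substitution, using a bisect-based binary search as the stand-in O(log n) lookup (all names are illustrative):

    import bisect

    def count_present(keys, sorted_values):
        # The loop runs n times; with a constant-time O(1) lookup the
        # whole function would be O(n).
        hits = 0
        for k in keys:
            # Swapping in an O(log n) binary search makes each
            # iteration O(log n), so the total becomes O(n log n).
            i = bisect.bisect_left(sorted_values, k)
            if i < len(sorted_values) and sorted_values[i] == k:
                hits += 1
        return hits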

Konrad Rudolph
How would one multiply for example O(n) with O(n log(n))?
sharkin
@R.A.: by multiplying the functions specifying the bounds, i.e. the result would be O(n^2 log n).
Konrad Rudolph
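As a hypothetical illustration of that multiplication: sorting each of n rows of length n runs an O(n log n) step inside an O(n) loop, and the bounds multiply.

    def sort_each_row(rows):
        # n iterations (one per row), each doing an O(n log n) sort on a
        # row of length n: O(n) * O(n log n) = O(n^2 log n) overall.
        return [sorted(row) for row in rows]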
A: 

One of the assumptions behind O(n)-style efficiency estimates is that actual running time approaches the theoretical bound, so we shouldn't get too wrapped around the axle trying to account for small variances. For instance, a simple iterative search through an unsorted dataset will on average find the value at the halfway point, after n/2 steps, but it is still an O(n) algorithm.

If you have a function that has to complete three subprocesses on a dataset before it can exit, then the slowest subprocess's running time on that dataset will be the longest pole in the tent, so to speak. In the specific example given, the function ('algorithm') runs in O(n*log(n)) time, and we don't fret that the exact cost is n + n*log(n) + log(n): the difference between that sum and n*log(n) alone is at most a small constant factor. What we care about is the relative magnitude, i.e., whether the running time is log(n), or n, or n^2, or n^3, ad infinitum, because that is where the factors of 10, or 100, or 1000 come from.
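A throwaway numeric check of that claim in Python (base-2 logs are an arbitrary choice; the base only shifts the constant):

    import math

    # (n + n*log(n) + log(n)) / (n*log(n)) falls toward 1 as n grows,
    # so the sum stays within a constant factor of its dominant term.
    for n in (10, 1_000, 100_000, 10_000_000):
        exact = n + n * math.log2(n) + math.log2(n)
        dominant = n * math.log2(n)
        print(n, round(exact / dominant, 3))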

John Clifford