I am having difficulty deciding what the time complexity of Euclid's greatest common divisor algorithm is. Sample pseudo-code is:

function gcd(a, b)
    while b ≠ 0
       t := b
       b := a mod b
       a := t
    return a
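
For reference, here is a direct Python translation of that pseudo-code (my own rendering, in case the notation is unclear):

def gcd(a, b):
    # Direct translation of the pseudo-code above.
    while b != 0:
        t = b
        b = a % b
        a = t
    return a

print(gcd(1071, 462))  # prints 21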

The only answer I can think of is, "It depends." It seems to depend on "a" and "b." But what would the big-O notation for this be?

My guess is that the time complexity is O(a % b). Is that correct, or is there a better way to write it?

+3  A: 

There's a great look at this in the Wikipedia article.

It even has a nice plot of complexity for value pairs.

It is not O(a%b).

It is known (see the article) that it will never take more steps than five times the number of digits in the smaller number. So the maximum number of steps grows with the number of digits (ln b). The cost of each step also grows with the number of digits, so the complexity is bounded by O(ln^2 b), where b is the smaller number. That's an upper limit; the actual time is usually less.
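
As a rough sanity check (a sketch of my own, not from the article; count_steps is just a hypothetical helper name), you can count the division steps and compare them against five times the digit count of the smaller number:

def count_steps(a, b):
    # Count how many division (mod) steps Euclid's algorithm takes.
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

for a, b in [(1071, 462), (10**6, 999983), (987654321, 123456789)]:
    bound = 5 * len(str(min(a, b)))  # five times the digits of the smaller number
    print(a, b, count_steps(a, b), "<=", bound)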

JoshD
What does `n` represent?
IVlad
@IVlad: Number of digits. I've clarified the answer, thank you.
JoshD
For the OP's algorithm, using (big integer) divisions (and not subtractions), it is in fact something more like O(n^2 log^2 n).
Alexandre C.
@Alexandre C.: Keep in mind `n = ln b`. What is the regular complexity of modulus for big int? Is it O(log n log^2 log n)
JoshD
@JoshD: it is something like that; I think I missed a log n term, so the final complexity (for the algorithm with divisions) is O(n^2 log^2 n log n) in this case.
Alexandre C.
@JoshD: I missed something: the typical complexity of division with remainder for bigints is O(n log^2 n log n) or O(n log^2 n) or something like that (I don't remember exactly), but definitely at least linear in the number of digits.
Alexandre C.
+2  A: 

See here.

In particular this part:

Lamé showed that the number of steps needed to arrive at the greatest common divisor for two numbers less than n is

[formula image: the number of steps is O(log_phi(sqrt(5) n)), i.e. O(log n)]

So O(log min(a, b)) is a good upper bound.
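
If you want to check that empirically, here is a small brute-force sketch of my own (not from the linked page), comparing the worst observed step count over all pairs below n against the log_phi bound:

import math

def count_steps(a, b):
    # Count how many division (mod) steps Euclid's algorithm takes.
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

n = 200
worst = max(count_steps(a, b) for a in range(1, n) for b in range(1, n))
phi = (1 + math.sqrt(5)) / 2
print(worst, "<=", math.log(math.sqrt(5) * n, phi))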

IVlad
That is true for the number of steps, but it doesn't account for the complexity of each step itself, which scales with the number of digits (ln n).
JoshD
+1  A: 

The worst case of Euclid's algorithm is when the remainders are the biggest possible at each step, i.e. for two consecutive terms of the Fibonacci sequence.

If n and m are the numbers of digits of a and b, with n >= m, the algorithm uses O(m) divisions.

Note that complexities are always given in terms of the sizes of inputs, in this case the number of digits.
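
To see the worst case concretely, here is a small sketch of my own that runs the algorithm on consecutive Fibonacci numbers; every quotient is 1, so the remainders shrink as slowly as possible and the step count grows by one with each Fibonacci index:

def count_steps(a, b):
    # Count how many division (mod) steps Euclid's algorithm takes.
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

a, b = 1, 1                      # F(2), F(1)
for k in range(3, 13):
    a, b = a + b, a              # now a = F(k), b = F(k-1)
    print("gcd(%d, %d) takes %d steps" % (a, b, count_steps(a, b)))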

Alexandre C.
+1  A: 

The trick to analyzing the time complexity of Euclid's algorithm is to follow what happens over two iterations:

a', b' := a % b, b % (a % b)

Now a and b will both decrease, instead of only one, which makes the analysis easier. You can divide it into cases:

  • Tiny A: 2*a <= b
  • Tiny B: 2*b <= a
  • Small A: 2*a > b but a < b
  • Small B: 2*b > a but b < a
  • Equal: a == b

Show that each case decreases a+b by a minimum amount:

  • Tiny A: b % (a % b) < a and 2*a <= b, so b is decreased by at least half, so a+b is decreased by at least 25%
  • Tiny B: a % b < b and 2*b <= a, so a is decreased by at least half, so a+b is decreased by at least 25%
  • Small A: b will become b-a, which is less than b/2, decreasing a+b by at least 25%.
  • Small B: a will become a-b, which is less than a/2, decreasing a+b by at least 25%.
  • Equal: a % b is 0, so b drops to 0 and a+b is cut in half, which is obviously a decrease of at least 25%.

Therefore, by case analysis, every double-step decreases a+b by at least 25%, i.e. multiplies it by a factor of at most 3/4. So the total number of double-steps (S) until we hit 0 must satisfy (3/4)^S * (A+B) >= 1, i.e. (4/3)^S <= A+B. Now just work it:

(4/3)^S <= A+B
S <= lg[4/3](A+B)
S is O(lg[4/3](A+B))
S is O(lg(A+B))
S is O(lg(A*B)) //because A*B is asymptotically greater than A+B
S is O(lg(A)+lg(B))
//Input size N is lg(A) + lg(B)
S is O(N)

Therefore, because the number of iterations is linear in the input size and the total running time is linear in the number of iterations (assuming mod takes constant time), the total running time is linear in the input size.
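
Here's a brute-force sketch of my own that verifies the key invariant; the integer comparison 4*(a+b) <= 3*total is just the "decreased by at least 25%" claim written without floating point:

def check(a, b):
    # Verify each double-step leaves a+b at most 75% of its old value.
    while b != 0:
        total = a + b
        a, b = b, a % b          # first step
        if b != 0:
            a, b = b, a % b      # second step of the double-step
        assert 4 * (a + b) <= 3 * total, (a, b, total)

for a in range(1, 300):
    for b in range(1, 300):
        check(a, b)
print("ok")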

Strilanc