views:

551

answers:

5

Hi,

I tried to solve problem# 276 from Project Euler, but without success so far.

Consider the triangles with integer sides a, b and c with a ≤ b ≤ c. An integer sided triangle (a,b,c) is called primitive if gcd(a,b,c)=1. How many primitive integer sided triangles exist with a perimeter not exceeding 10 000 000?

The bottleneck in my code is the GCD function. It almost takes 90% of execution time for my sample input. I would love to hear hints, comments on how to improve the solution... My solution is written in C, but it is quit simple:

#include <stdio.h>
#include <math.h>

#define sides 3
// This is where my program spends most of its time
size_t bi_gcd (size_t a, size_t b);
size_t tri_gcd (size_t a, size_t b, size_t c);
size_t count_primitive_triangles (size_t maximum_perimeter);

int main()
{
 printf( "primitive_triangles = %lu \n",
            count_primitive_triangles(1000) );
}

size_t bi_gcd (size_t a, size_t b)
{
 size_t t;

 while(b != 0)
 {
  t = b;
  b = a % b;
  a = t;
 }

 return a;
}

size_t tri_gcd (size_t a, size_t b, size_t c)
{
 return bi_gcd(a, bi_gcd(b, c));
}

size_t count_primitive_triangles (size_t max_perimeter)
{
 size_t count = 0; // number of primitive triangles
 size_t a, b, c;   // sides of the triangle
 // the following are the bounds of each side
 size_t 
  // because b >= a && c >= b ==> max of a
  // is when a+b+c > 10,000,000
  // b == a (at least)
  // c == b (at least)
  // ==> a+b+c >= 10,000,000
  // ==> 3*a   >= 10,000,000
  // ==> a= 10,000,000/3
  a_limit = max_perimeter/sides,
  b_limit,
  c_limit;

 for (a = 1; a <= a_limit; ++a)
 {
  // because c >= b && a+b+c <= 10,000,000
  // ==> 2*b + a = 10,000,000
  // ==> 2*b = 10,000,000 - a
  // ==> b = (10,000,000 - a)/2
  for (b = a, b_limit = (max_perimeter-a)/2; b <= b_limit; ++b)
  {
   // The triangle inequality:
   // a+b > c (for a triangle to be valid!)
   // ==> c < a+b
   for (c = b, c_limit = a+b; c < c_limit; ++c)
   {
    if (tri_gcd(a, b, c) == 1)
     ++count;
   }
  }
 }

 return count;
}
+2  A: 

Brute force solutions tend to be really slow :)

A hint to reduce calculations:

  1. Precalculate all the primes in the range 2..107 and store them in a lookup table.
  2. In bi_gcd(a, b) check if a or b is a prime and return 1 otherwise calculate their gcd.
    Do the same in tri_gcd(a, b, c).

Edit: taking account of @interjay's comment:

If for example a is a prime we still have to check if b is a multiple of a,
something like this (b % a == 0). In that case a is the gcd.

Nick D
5 is prime, but gcd(5,10) is not 1. You would still need to check if the other number is a multiple of the prime.
interjay
@interjay, oops, you are right.
Nick D
`b%a==0` is much better than `(b/a)*a==b`. But then it's handled by the Euclidean step.
KennyTM
@KennyTM, thanks ;)
Nick D
A: 

In addition to Nick D's answer, which will stop you bothering to compute bi_gcd when one or both inputs is prime, I find myself wondering how many times you must be calling bi_gcd with the same (composite) numbers. GCD(12, 18) is always going to be 6, no matter how many times you calculate it. A mechanism to store results could improve things, I suspect.

AakashM
This technique is called "memoization", by the way.
Svante
+4  A: 

These are some things you can improve:

  • Instead of calculating tri_gcd(a,b,c) in the inner loop, calculate g1 = gcd(a,b) inside the second loop, and g2 = gcd(g1, c) in the inner loop.

  • When gcd(a,b)==1, you can avoid the inner loop and increment the count by max_c - min_c + 1, since you know the gcd will be 1 for any value of c.

  • Your inner loop seems to count too high, since it doesn't check that a+b+c <= 10000000.

Unfortunately, even with these changes and the ones mentioned in other answers so far, this will probably be too slow. I believe the real solution would not enumerate all possible triangles, but somehow count them in groups.

interjay
+1 for reducing the search space, very good idea.
Nick D
A: 

As a small improvement, you could insert some a special cases, e.g.

size_t bi_gcd (size_t a, size_t b) {
    if (a == 1 || b == 1) return 1;
    ...

size_t tri_gcd (size_t a, size_t b, size_t c) {
    if (a%2==0 && b%2==0 && c%2==0) return 2;
    // of course GCD(a,b,c) can be > 2,
    // but the exact GCD is not needed in this problem.
    ...

With these two if-s I can reduce the time taken from 1.2 seconds to 1.0 seconds (with gcc -O3).

KennyTM
+8  A: 

It's good that you're profiling before optimizing, but the fact that a lot of time in the gcd function doesn't necessarily mean you need to (or can) make it faster, rather that you're calling it too often. :-) Here's a hint — an algorithmic improvement that will improve the running time by orders of magnitude, instead of merely an improvement in the implementation.

You're currently counting just the primitive triangles individually. Instead, ask yourself this: can you efficiently count all triangles (not necessarily primitive) with perimeter a+b+c=n? (Running time O(n) will do – your current algorithm is Ω(n3).) Having done that, which triangles have you over-counted? For example, how many triangles are there with p dividing the sides? (Hint: a+b+c=n <=> (a/p) + (b/p) + (c/p) = n/p.) And so on.

Edit: After solving the problem and checking the thread on Project Euler, I found there are some other nice approaches to this problem, but the above approach is most common, and it works. For the first part, you can count it directly (some people have done it; it's certainly doable), or you may find this additional hint/trick helpful:

  • Suppose 1 ≤ a ≤ b ≤ c, and a+b+c = n. Let p = b+c-a = n-2a, let q = c+a-b = n-2b, and let r = a+b-c = n-2c. Then 1 ≤ r ≤ q ≤ p, and p+q+r = a+b+c = n. Also, p,q,r all have the same parity as n.
  • Conversely, for any 1 ≤ r ≤ q ≤ p with p+q+r = n with all three having the same parity (as n), let a = (q+r)/2, let b = (r+p)/2, c = (p+q)/2. Then 1 ≤ a ≤ b ≤ c and a+b+c=n. Further, these are integers and form a triangle (check).
  • So the number of integer triangles (a,b,c) with perimeter n is exactly the number of partitions of n into parts (p,q,r) of same parity. You may find this easier to count.

Additionally/alternatively, try directly relating this number T(n) to smaller numbers T(n-k), to get a recurrence relation.

(Of course, there are also some very simple formulae you can find if you Google, but what's the fun in that?)

ShreevatsaR