Hi, this is a program I wrote to calculate Pythagorean triplets. When I run the program, it prints each set of triplets twice because of the if statement. Is there any way I can tell the program to print each new set of triplets only once? Thanks.

import math

def main():
    for x in range(1, 1000):
        for y in range(1, 1000):
            for z in range(1, 1000):
                if x*x == y*y + z*z:
                    print y, z, x
                    print '-'*50

if __name__ == '__main__':
    main()
+2  A: 

Yes, there is.

Okay, now you'll want to know how. Why not just constrain it so that z > y? Try

for z in range(y + 1, 1000):
dysfunctor
+10  A: 

You should define x < y < z.

for x in range(1, 1000):
    for y in range(x + 1, 1000):
        for z in range(y + 1, 1000):

Another good optimization would be to only use x and y and calculate zsqr = x * x + y * y. If zsqr is a square number (or z = sqrt(zsqr) is a whole number), it is a triplet, else not. That way, you need only two loops instead of three (for your example, that's about 1000 times faster).
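
A minimal sketch of that two-loop idea in Python 3, assuming `math.isqrt` (available from Python 3.8) for an exact integer square root. The function name `triples_two_loops` and the early `break` once z reaches the limit are illustrative choices, not part of the answer above.

```python
import math

def triples_two_loops(limit=1000):
    """Yield (x, y, z) with x < y < z < limit and x*x + y*y == z*z."""
    for x in range(1, limit):
        for y in range(x + 1, limit):
            zsqr = x * x + y * y
            z = math.isqrt(zsqr)   # floor of the square root, integer-only
            if z >= limit:
                break              # a larger y can only make z larger
            if z * z == zsqr:      # zsqr is a perfect square
                yield x, y, z
```

Calling `list(triples_two_loops(21))` should give the six triples whose sides are all at most 20, each of them exactly once.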

schnaader
That all makes sense, but check your variable names! You're getting your x's, y's and z's muddled. :-)
dysfunctor
Fixed this, thanks. It's now zsqr = x * x + y * y like it should be
schnaader
The problem actually defines that x+y+z=1000, so you can eliminate the third loop entirely with a simple z = 1000-x-y
annakata
You do not need 3 embedded loops: check out ΤΖΩΤΖΙΟΥ's algorithm or mine. If you want to find the first 10,000 Pythagorean triplets, our algorithms will save you days of calculations.
MiniQuark
My answer was about the OP's original post, not about Project Euler #9 (at first he didn't say whether it was about that problem). In that case you can also use 2 loops and simply exit once x+y+z=1000. Any further optimizations won't mean anything, as the runtime is under a second.
schnaader
+1  A: 

I wrote that program in Ruby and it is similar to the Python implementation. The important line is:

if x*x == y*y + z*z && gcd(y, z) == 1

Then you have to implement a method that returns the greatest common divisor (gcd) of two given numbers. A very simple example, in Ruby again:

def gcd(a, b)
    while b != 0
      t = b
      b = a%b
      a = t
    end
    return a
end

The full Ruby method to find the triplets would be:

def find_triple(upper_boundary)

  (5..upper_boundary).each {|c|
    (4..c-1).each {|b|
      (3..b-1).each {|a|
        if (a*a + b*b == c*c && gcd(a,b) == 1)
          puts "#{a} \t #{b} \t #{c}"
        end
      }
    }
  }
end
Christian Stade-Schuldt
Just to clarify, the GCD step is only necessary if you want to print primitive triplets. If you want every triplet, you should leave it out.
DasBoot
This algorithm uses 3 embedded loops, although only 2 are required (see ΤΖΩΤΖΙΟΥ's algorithm or mine). This means that if you search for the first 10,000 Pythagorean triplets, your algorithm will be 10,000 times slower than a "2-loops" algorithm.
MiniQuark
Be careful with this reasoning. The algorithm that I produced above uses a three-loop construct but is approximately three times faster than the two-loop construct used by ΤΖΩΤΖΙΟΥ.
Jason
@Jason: be careful with comparisons, especially when comparing apples and oranges. In addition, MiniQuark's remark was generic and vague enough to be non-refutable (although not necessarily applicable), while yours lacks the numbers and a common runtime environment.
ΤΖΩΤΖΙΟΥ
@ΤΖΩΤΖΙΟΥ: I translated your code to C# and tested them apples to apples.
Jason
+2  A: 
def pyth_triplets(n=1000):
    for x in xrange(1, n):
        x2= x*x # time saver
        for y in xrange(x+1, n): # y > x
            z2= x2 + y*y
            zs= int(z2**.5)
            if zs*zs == z2:
                yield x, y, zs

>>> print list(pyth_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
ΤΖΩΤΖΙΟΥ
+5  A: 

Algorithms can be tuned for speed, memory usage, simplicity, and other things.

Here is a pythagore_triplets algorithm tuned for speed, at the cost of memory usage and simplicity. If all you want is speed, this could be the way to go.

Calculation of list(pythagore_triplets(10000)) takes 40 seconds on my computer, versus 63 seconds for ΤΖΩΤΖΙΟΥ's algorithm, and possibly days of calculation for Tafkas's algorithm (and all other algorithms which use 3 embedded loops instead of just 2).

def pythagore_triplets(n=1000):
   maxn=int(n*(2**0.5))+1 # max int whose square may be the sum of two squares
   squares=[x*x for x in xrange(maxn+1)] # calculate all the squares once
   reverse_squares=dict([(squares[i],i) for i in xrange(maxn+1)]) # x*x=>x
   for x in xrange(1,n):
     x2 = squares[x]
     for y in xrange(x,n+1):
       y2 = squares[y]
       z = reverse_squares.get(x2+y2)
       if z is not None:
         yield x,y,z

>>> print list(pythagore_triplets(20))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]

Note that if you are going to calculate the first billion triplets, then this algorithm will crash before it even starts, because of an out of memory error. So ΤΖΩΤΖΙΟΥ's algorithm is probably a safer choice for high values of n.

BTW, here is Tafkas's algorithm, translated into Python for the purpose of my performance tests. Its flaw is that it requires 3 loops instead of 2.

def gcd(a, b):
  while b != 0:
    t = b
    b = a%b
    a = t
  return a

def find_triple(upper_boundary=1000):
  for c in xrange(5,upper_boundary+1):
    for b in xrange(4,c):
      for a in xrange(3,b):
        if (a*a + b*b == c*c and gcd(a,b) == 1):
          yield a,b,c
MiniQuark
Square root cost in the Python virtual machine is overrated, but I like your answer.
ΤΖΩΤΖΙΟΥ
BTW, I found a bug in the first version of my algorithm: I needed reverse_squares up to n**2+(n-1)**2, but I only had reverse_squares up to n**2. So I rounded up to 2*(n**2): the bug is fixed.
MiniQuark
For the fastest GCD algorithm, please see http://en.wikipedia.org/wiki/Binary_GCD_algorithm
lakshmanaraj
Please reread V4 carefully. The outer while tests z, which is ALSO being incremented in the inner while. So the inner while is speeding up the outer one. That's why it's quadratic, not cubic as you seem to think. And V4 doesn't run out of memory if the dictionary gets big.
joel.neely
I agree with you that the differences in performance between Python and Java are interesting. I suspect the fact that Java handles boxing/unboxing differently for the "hash of square roots" map may be a factor.
joel.neely
+20  A: 

Pythagorean Triples make a good example for claiming "for loops considered harmful", because for loops seduce us into thinking about counting, often the most irrelevant part of a task.

(I'm going to stick with pseudo-code to avoid language biases, and to keep the pseudo-code streamlined, I'll not optimize away multiple calculations of e.g. x * x and y * y.)

Version 1:

for x in 1..N {
    for y in 1..N {
        for z in 1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}

is the worst solution. It generates duplicates, and traverses parts of the space that aren't useful (e.g. whenever z < y). Its time complexity is cubic on N.

Version 2, the first improvement, comes from requiring x < y < z to hold, as in:

for x in 1..N {
    for y in x+1..N {
        for z in y+1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}

which reduces run time and eliminates duplicated solutions. However, it is still cubic on N; the improvement is just a reduction of the co-efficient of N-cubed.

It is pointless to continue examining increasing values of z after z * z < x * x + y * y no longer holds. That fact motivates Version 3, the first step away from brute-force iteration over z:

for x in 1..N {
    for y in x+1..N {
        z = y + 1
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
    }
}

For N of 1000, this is about 5 times faster than Version 2, but it is still cubic on N.

The next insight is that x and y are the only independent variables; z depends on their values, and the last z value considered for the previous value of y is a good starting search value for the next value of y. That leads to Version 4:

for x in 1..N {
    y = x+1
    z = y+1
    while z <= N {
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
        y = y + 1
    }
}

which allows y and z to "sweep" the values above x only once. Not only is it over 100 times faster for N of 1000, it is quadratic on N, so the speedup increases as N grows.
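
For readers who want to run it, here is a rough Python 3 rendering of Version 4. The function name `triples_sweep` is made up for this sketch, and x * x is hoisted out of the inner loops, which the pseudo-code deliberately does not do.

```python
def triples_sweep(n):
    """Yield (x, y, z) with x < y < z <= n, using only integer arithmetic."""
    for x in range(1, n + 1):
        xx = x * x
        y = x + 1
        z = y + 1
        while z <= n:
            target = xx + y * y
            # z never moves backwards: it resumes where the previous y left it
            while z * z < target:
                z += 1
            if z * z == target and z <= n:
                yield x, y, z
            y += 1
```

For each x, the variables y and z only ever move forward, so the two while loops together do work proportional to n rather than n squared.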

I've encountered this kind of improvement often enough to be mistrustful of "counting loops" for any but the most trivial uses (e.g. traversing an array).

Update: Apparently I should have pointed out a few things about V4 that are easy to overlook.

  1. Both of the while loops are controlled by the value of z (one directly, the other indirectly through the square of z). The inner while is actually speeding up the outer while, rather than being orthogonal to it. It's important to look at what the loops are doing, not merely to count how many loops there are.

  2. All of the calculations in V4 are strictly integer arithmetic. Conversion to/from floating-point, as well as floating-point calculations, are costly by comparison.

  3. V4 runs in constant memory, requiring only three integer variables. There are no arrays or hash tables to allocate and initialize (and, potentially, to cause an out-of-memory error).

  4. The original question allowed all of x, y, and z to vary over the same range. V1..V4 followed that pattern.

Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)

Version 4:           46 sec.
using square root:  134 sec.
array and map:      400 sec.

The "array and map" algorithm is essentially:

squares = array of i*i for i in 1 .. N
roots = map of i*i -> i for i in 1 .. N
for x in 1 .. N
    for y in x+1 .. N
        z = roots[squares[x] + squares[y]]
        if z exists use x, y, z

The "using square root" algorithm is essentially:

for x in 1 .. N
    for y in x+1 .. N
        z = (int) sqrt(x * x + y * y)
        if z * z == x * x + y * y then use x, y, z

The actual code for V4 is:

public Collection<Triple> byBetterWhileLoop() {
    Collection<Triple> result = new ArrayList<Triple>(limit);
    for (int x = 1; x < limit; ++x) {
        int xx = x * x;
        int y = x + 1;
        int z = y + 1;
        while (z <= limit) {
            int zz = xx + y * y;
            while (z * z < zz) {++z;}
            if (z * z == zz && z <= limit) {
                result.add(new Triple(x, y, z));
            }
            ++y;
        }
    }
    return result;
}

Note that x * x is calculated in the outer loop (although I didn't bother to cache z * z); similar optimizations are done in the other variations.

I'll be glad to provide the Java source code on request for the other variations I timed, in case I've mis-implemented anything.

joel.neely
In your version 4 you *still* have 3 embedded loops, so calculating the first 10,000 Pythagorean triplets will literally take days. You may want to check out ΤΖΩΤΖΙΟΥ's algorithm or mine for a 2-loop solution.
MiniQuark
You don't understand how V4 differs from V1-V3. It took 49 seconds on my laptop to calculate the first 12471 triples. Both "while" loops are testing on z's value.
joel.neely
@joel: I must admit that I have trouble understanding your algorithm, but I just ran it and it took 61 seconds to calculate pythagore_triplets_v4(10000), so I believe you: it looks like it's quadratic. But still your algorithm is 50% slower than my "array and map" algorithm.
MiniQuark
@joel: I finally understand this algorithm. Took me some time. ;-) Nice work.
MiniQuark
A: 

Version 5 to Joel Neely.

Since x < y < z <= N, we have 2 * x * x < x * x + y * y = z * z <= N * N, so x can be at most N / Sqrt(2). x can also start from 3, because (3, 4, 5) is the smallest triple.

MaxX = N / ( 2 ** 0.5 )

for x in 3..MaxX {
  y = x+1
  z = y+1
  m = x*x + y*y
  k = z * z
  while z <= N {
     while k < m {
        z = z + 1
        k = k + (2*z) - 1
    }
    if k == m and z <= N then {
        // use x, y, z
    }
    y = y + 1
    m = m + (2 * y) - 1
  }
 }
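
The incremental-square trick in the pseudo-code above can be tried out with a short Python 3 sketch. The name `triples_incremental` is hypothetical, and for simplicity the sketch lets x run over the whole range from 3 to n instead of applying an upper bound on x.

```python
def triples_incremental(n):
    """Yield (x, y, z) with x < y < z <= n, updating squares by addition only."""
    for x in range(3, n + 1):
        y = x + 1
        z = y + 1
        m = x * x + y * y          # m tracks x*x + y*y
        k = z * z                  # k tracks z*z
        while z <= n:
            while k < m:
                z += 1
                k += 2 * z - 1     # z*z == (z-1)*(z-1) + 2*z - 1
            if k == m and z <= n:
                yield x, y, z
            y += 1
            m += 2 * y - 1         # y*y == (y-1)*(y-1) + 2*y - 1
```

After the first iteration for each x, no multiplication by a loop variable is needed: both running squares are kept up to date with one addition each.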
lakshmanaraj
This algorithm uses 3 embedded loops, although only 2 are required (see ΤΖΩΤΖΙΟΥ's algorithm or mine). This means that if you search for the first 10,000 Pythagorean triplets, your algorithm will be roughly 10,000 times slower than a "2-loops" algorithm.
MiniQuark
I just optimized Joel Neely's code further. I will go through your code and optimize it if required.
lakshmanaraj
I also roughly checked the number of loop iterations: even though it appears to be a 3-level loop, the total number of statements executed is actually lower than for the 2-loop version. Please check by counting the number of times k or m is changed.
lakshmanaraj
@lakshmanaraj: MiniQuark didn't understand the relationship between the two while loops (they both are driven by z).
joel.neely
I've posted the actual code for V4; as you can see, it moves the squaring of x out of the nested loops.
joel.neely
+4  A: 

The previously listed algorithms for generating Pythagorean triplets are all modifications of the naive approach derived from the basic relationship a^2 + b^2 = c^2 where (a, b, c) is a triplet of positive integers. It turns out that Pythagorean triplets satisfy some fairly remarkable relationships that can be used to generate all Pythagorean triplets.

Euclid discovered the first such relationship. He determined that for every Pythagorean triple (a, b, c), possibly after a reordering of a and b, there are relatively prime positive integers m and n with m > n, at least one of which is even, and a positive integer k such that

a = k (2mn)
b = k (m^2 - n^2)
c = k (m^2 + n^2)

Then to generate Pythagorean triplets, generate relatively prime positive integers m and n of differing parity, and a positive integer k and apply the above formula.

struct PythagoreanTriple {
    public int a { get; private set; }
    public int b { get; private set; }
    public int c { get; private set; }

    public PythagoreanTriple(int a, int b, int c) : this() {
        this.a = a < b ? a : b;
        this.b = b < a ? a : b;
        this.c = c;
    }

    public override string ToString() {
        return String.Format("a = {0}, b = {1}, c = {2}", a, b, c);
    }

    public static IEnumerable<PythagoreanTriple> GenerateTriples(int max) {
        var triples = new List<PythagoreanTriple>();
        for (int m = 1; m <= max / 2; m++) {
            for (int n = 1 + (m % 2); n < m; n += 2) {
                if (m.IsRelativelyPrimeTo(n)) {
                    for (int k = 1; k <= max / (m * m + n * n); k++) {
                        triples.Add(EuclidTriple(m, n, k));
                    }
                }
            }
        }

        return triples;
    }

    private static PythagoreanTriple EuclidTriple(int m, int n, int k) {
        int msquared = m * m;
        int nsquared = n * n;
        return new PythagoreanTriple(k * 2 * m * n, k * (msquared - nsquared), k * (msquared + nsquared));
    }
}

public static class IntegerExtensions {
    private static int GreatestCommonDivisor(int m, int n) {
        return (n == 0 ? m : GreatestCommonDivisor(n, m % n));
    }

    public static bool IsRelativelyPrimeTo(this int m, int n) {
        return GreatestCommonDivisor(m, n) == 1;
    }
}

class Program {
    static void Main(string[] args) {
        PythagoreanTriple.GenerateTriples(1000).ToList().ForEach(t => Console.WriteLine(t));            
    }
}
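
As a compact companion to the C# listing, the same parametrization can be sketched in Python 3. This is an illustration rather than a port: the name `euclid_triples` is invented here, m is bounded by the integer square root of the limit instead of max / 2, and `math.gcd` and `math.isqrt` are assumed to be available (Python 3.8+).

```python
from math import gcd, isqrt

def euclid_triples(max_c):
    """Yield every (a, b, c) with a < b < c <= max_c via Euclid's formula."""
    for m in range(2, isqrt(max_c) + 1):
        # n has opposite parity to m; with gcd(m, n) == 1 the k = 1 triple is primitive
        for n in range(1 + m % 2, m, 2):
            if gcd(m, n) != 1:
                continue
            c = m * m + n * n
            if c > max_c:
                break                      # c only grows as n grows
            a, b = sorted((2 * m * n, m * m - n * n))
            for k in range(1, max_c // c + 1):
                yield k * a, k * b, k * c
```

The triples come out grouped by (m, n) rather than in numerical order, so sort the result if order matters, for example `sorted(euclid_triples(20))`.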

The Wikipedia article on Formulas for generating Pythagorean triples contains other such formulae.

Jason
The Ternary Tree method is very good (see the Wikipedia article).
starblue
A: 

Just checking, but I've been using the following code to generate Pythagorean triples. It's very fast. I tried some of the examples here, learned from them, wrote my own version, and came back to check it against this thread (this was 2 years ago). I think this code correctly finds all Pythagorean triples up to whatever limit you name, and fairly quickly too. I wrote it in C++.

ullong is unsigned long long, and I created two helper functions, _square and _root. _square squares its argument. _root takes the integer part of the square root of the given number; if that value squared does not equal the given number, it returns -1 because the number has no integer root. I know of another way to optimize this, but I haven't implemented or tested it yet.

void generate(vector<Triple>& triplist, ullong limit) {
    cout << "Please wait as triples are being generated." << endl;
    ullong a, b, c;
    Triple trip;
    time_t timer = time(0);

    for (a = 1; a <= limit; ++a) {
        for (b = a + 1; b <= limit; ++b) {
            c = _root(_square(a) + _square(b));

            // _root signals "not a perfect square" with -1, which is the
            // largest value once stored in an ullong. Skip it before the
            // limit test, or the loop would stop at the first non-square.
            if (c == (ullong)-1)
                continue;
            if (c > limit)
                break;

            trip.a = a; trip.b = b; trip.c = c;
            triplist.push_back(trip);
        }
    }

    timer = time(0) - timer;
    cout << "Generated " << triplist.size() << " in " << timer << " seconds." << endl;
    cin.get();
    cin.get();
}

Let me know what you all think. According to the teacher I turned it in to, it generates all primitive and non-primitive triples (she tested it up to 100, if I remember correctly).

The results from the v4 supplied by a previous coder here are

Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)

Version 4:           46 sec.
using square root:  134 sec.
array and map:      400 sec.

The results from mine are:

How many triples to generate: 10000
Please wait as triples are being generated.
Generated 12471 in 2 seconds.

That is before I even start optimizing via the compiler. (I remember previously getting the 10,000 limit down to 0 seconds with lots of special compiler options.) My code also generates all the triples with 100,000 as the upper limit for both legs and the hypotenuse in 3.2 minutes (I think the 1,000,000 limit takes an hour).

I modified the code a bit and got the 10,000 limit down to 1 second (no optimizations). On top of that, with careful thinking, the work could be broken into chunks and run on threads over given ranges. For example, a limit of 100,000 could be divided into 4 equal chunks for 3 CPUs (1 extra chunk, hopefully to use up spare CPU time), with ranges 1 to 25,000, 25,000 to 50,000, 50,000 to 75,000, and 75,000 to the end. I may do that and see whether it speeds things up. I would create the threads in advance and not count their setup in the time taken by the triple function, and I'd need a more precise timer and a way to concatenate the vectors. I think that if one 3.4 GHz CPU with 8 GB of RAM at its disposal can handle a limit of 10,000 in 1 second, then 3 CPUs should do it in about 1/3 of a second (at the moment I round up to the next whole second).
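
The chunking idea can be sketched in Python 3 as follows. Everything here is hypothetical scaffolding for illustration: the names `chunk_ranges`, `triples_for_chunk` and `generate_parallel` are invented, worker processes stand in for threads, and the chunks are made non-overlapping so no value of a is handled twice.

```python
import math
from concurrent.futures import ProcessPoolExecutor

def chunk_ranges(limit, parts):
    """Split 1..limit into at most `parts` contiguous (start, stop) ranges, stop inclusive."""
    size = -(-limit // parts)  # ceiling division
    return [(lo, min(lo + size - 1, limit)) for lo in range(1, limit + 1, size)]

def triples_for_chunk(args):
    """Find triples a < b < c <= limit whose smallest side a lies in [lo, hi]."""
    lo, hi, limit = args
    found = []
    for a in range(lo, hi + 1):
        for b in range(a + 1, limit + 1):
            csq = a * a + b * b
            c = math.isqrt(csq)
            if c > limit:
                break              # a larger b can only make c larger
            if c * c == csq:
                found.append((a, b, c))
    return found

def generate_parallel(limit, parts=4):
    """Run each chunk in its own worker process and concatenate the results."""
    jobs = [(lo, hi, limit) for lo, hi in chunk_ranges(limit, parts)]
    with ProcessPoolExecutor(max_workers=parts) as pool:
        return [t for chunk in pool.map(triples_for_chunk, jobs) for t in chunk]
```

On platforms that start workers by spawning a fresh interpreter, `generate_parallel` should be called from under an `if __name__ == "__main__":` guard. Note also that equal-width chunks are not equal amounts of work: the inner loop over b stops early once c exceeds the limit, so chunks with small a do far more iterations than the last one, and the speedup will probably fall short of the number of CPUs.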

lmartin92