views:

113

answers:

4

I'm working on Project Euler #14 in C and have figured out the basic algorithm; however, it runs insufferably slow for large numbers, e.g. 2,000,000 as wanted; I presume because it has to generate the sequence over and over again, even though there should be a way to store known sequences (e.g., once we get to a 16, we know from previous experience that the next numbers are 8, 4, 2, then 1).

I'm not exactly sure how to do this with C's fixed-length array, but there must be a good way (that's amazingly efficient, I'm sure). Thanks in advance.

Here's what I currently have, if it helps.

#include <stdio.h>
#define UPTO 2000000

int collatzlen(int n);

int main(){
    int i, l=-1, li=-1, c=0;
    for(i=1; i<=UPTO; i++){
        if( (c=collatzlen(i)) > l) l=c, li=i;
    }
    printf("Greatest length:\t\t%7d\nGreatest starting point:\t%7d\n", l, li);
    return 1;
}

/* n != 0 */
int collatzlen(int n){
    int len = 0;
    while(n>1) n = (n%2==0 ? n/2 : 3*n+1), len+=1;
    return len;
}
A: 

You can use a fixed length array as a hash table and use a linked list for each array entry to save the values.

In this case if you use an array of length 2000000 the sequence length for number x could be stored for example in the linked list in array index (x mod 2000000).

Itay
+1  A: 

Given that this is essentially a throw-away program (i.e. once you've run it and got the answer, you're not going to be supporting it for years :), I would suggest having a global variable to hold the lengths of sequences already calculated:

int lengthfrom[UPTO] = {};

If your maximum size is a few million, then we're talking megabytes of memory, which should easily fit in RAM at once.

The above will initialise the array to zeros at startup. In your program - for each iteration, check whether the array contains zero. If it does - you'll have to keep going with the computation. If not - then you know that carrying on would go on for that many more iterations, so just add that to the number you've done so far and you're done. And then store the new result in the array, of course.

Don't be tempted to use a local variable for an array of this size: that will try to allocate it on the stack, which won't be big enough and will likely crash.

Also - remember that with this sequence the values go up as well as down, so you'll need to cope with that in your program (probably by having the array longer than UPTO values, and using an assert() to guard against indices greater than the size of the array).

psmears
The problem with this approach is that you cannot know the right UPTO value.
Itay
True. But given that the aim is just to find the right answer to a specific question (rather than to produce a program that succeeds on all inputs) it doesn't hurt to increase the size of the array and run it again. If that doesn't work (once the array starts to get close to the size of the machine's memory) then one of the more complex approaches (e.g. a hash table with buckets) can be implemented - but you may save time by trying the easy way first :)
psmears
+1  A: 

If I recall correctly, your problem isn't a slow algorithm: the algorithm you have now is fast enough for what PE asks you to do. The problem is overflow: you sometimes end up multiplying your number by 3 so many times that it will eventually exceed the maximum value that can be stored in a signed int. Use unsigned ints, and if that still doesn't work (but I'm pretty sure it does), use 64 bit ints (long long).

This should run very fast, but if you want to do it even faster, the other answers already addressed that.

IVlad
+3  A: 

Your original program needs 3.5 seconds on my machine. Is it insufferably slow for you?

My dirty and ugly version needs 0.3 seconds. It uses a global array to store the values already calculated. And use them in future calculations.

int collatzlen2(unsigned long n);
static unsigned long array[2000000 + 1];//to store those already calculated

int main()
{
    int i, l=-1, li=-1, c=0;
    int x;
    for(x = 0; x < 2000000 + 1; x++) {
        array[x] = -1;//use -1 to denote not-calculated yet
    }

    for(i=1; i<=UPTO; i++){
        if( (c=collatzlen2(i)) > l) l=c, li=i;
    }
    printf("Greatest length:\t\t%7d\nGreatest starting point:\t%7d\n", l, li);

    return 1;
}

int collatzlen2(unsigned long n){
    unsigned long len = 0;
    unsigned long m = n;
    while(n > 1){
        if(n > 2000000 || array[n] == -1){ // outside range or not-calculated yet
            n = (n%2 == 0 ? n/2 : 3*n+1);
            len+=1;
        }
        else{ // if already calculated, use the value
            len += array[n];
            n = 1; // to get out of the while-loop
        }
    }
    array[m] = len;
    return len;
}
yehnan
+1 just store a bunch of them, and if 3n+1 is larger than the maximum value you're storing, calculate it again. 2~3 extra lines of very simple code for an order of magnitude increase in speed!
BlueRaja - Danny Pflughoeft