views:

489

answers:

5

The following iterative sequence is defined for the set of positive integers:

n ->n/2 (n is even) n ->3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 40 20 10 5 16 8 4 2 1 It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

I tried coding a solution to this in C using the bruteforce method. However, it seems that my program stalls when trying to calculate 113383. Please advise :)

#include <stdio.h>
#define LIMIT 1000000

int iteration(int value)
{
 if(value%2==0)
  return (value/2);
 else
  return (3*value+1);
}

int count_iterations(int value)
{
 int count=1;
 //printf("%d\n", value);
 while(value!=1)
 {
  value=iteration(value);
  //printf("%d\n", value);
  count++;
 }
 return count;
}

int main()
{
 int iteration_count=0, max=0;
 int i,count;


 for (i=1; i<LIMIT; i++)
 {
  printf("Current iteration : %d\n", i);
  iteration_count=count_iterations(i);
  if (iteration_count>max)
   {
   max=iteration_count;
   count=i;
   }

 }

 //iteration_count=count_iterations(113383); 
 printf("Count = %d\ni = %d\n",max,count);

}
+6  A: 

Notice that your brute force solution often computes the same subproblems over and over again. For example, if you start with 10, you get 5 16 8 4 2 1; but if you start with 20, you get 20 10 5 16 8 4 2 1. If you cache the value at 10 once it's computed, and then won't have to compute it all over again.

(This is known as dynamic programming.)

Jesse Beder
the proper term is memoization, http://en.wikipedia.org/wiki/Memoization
Nick D
I understand the concept but am not very sure how it translates into code. Perhaps a lookup table?
paradox
@Nick I disagree that "the proper term is". See the wiki article. Memoization is just *one* approach that can be utilized in DP.
pst
This is not the reason he's hanging, the reason is that `int` isn't big enough (see my answer with less votes below).
Motti
@pst, DP makes use of recursion and memoization techniques for finding optimal solutions.
Nick D
@Nick, I was taking a broader approach in my terminology. Yes, caching is known as memoization, but it all falls under the umbrella of dynamic programming, which is what he should study. (Of course, I completely missed the 32-bit int business, which was the actual problem.) This also raises the question of whether you should post an answer and then immediately go to sleep :)
Jesse Beder
A: 

As has been said, the simplest way is to get some memoization to avoid recomputing things that haven't been computed. You might be interested in knowing that there is no cycle if you being from a number under one million (no cycle has been discovered yet, and people have explored much bigger numbers).

To translate it in code, you can think the python way:

MEMOIZER = dict()

def memo(x, func):
  global MEMOIZER
  if x in MEMOIZER: return MEMOIZER[x]

  r = func(x)

  MEMOIZER[x] = r

  return r

Memoization is a very generic scheme.

For the Collatze conjecture, you might run in a bit of a pinch because numbers can really grow and therefore you might blow up the available memory.

This is traditionally handled using caching, you only cache the last n results (tailored to occupy a given amount of memory) and when you already have n items cached and wish to add a newer one, you discard the older one.

For this conjecture there might be another strategy available, though a bit harder to implement. The basic idea is that you have only ways to reach a given number x:

  • from 2*x
  • from (x-1)/3

Therefore if you cache the results of 2*x and (x-1)/3 there is no point in caching x any longer >> it'll never get called anymore (except if you wish to print the sequence at the end... but it's only once). I leave it to you to take advantage of this so that your cache does not grow too much :)

Matthieu M.
+1  A: 

Having just tested it in C#, it appears that 113383 is the first value where the 32-bit int type becomes too small to store every step in the chain.

Try using an unsigned long when handling those big numbers ;)

Thibault Falise
True, but note that in C `long` may very well be the same as `int` and `long long` should be used.
Motti
@Motti: uintmax_t seems a reasonable choice.
Jonathan Leffler
@Jonathan, I wasn't familiar with `uintmax_t`. I see it's part of C99 which must be the reason for this, I use C++ which doesn't define it.
Motti
@Motti: fair enough - but the C++ standard doesn't define `long long` either :)
Jonathan Leffler
+2  A: 

The reason you're stalling is because you pass through a number greater than 2^31-1 (aka INT_MAX) try using unsigned long long instead of int.

I recently blogged about this, note that in C the naive iterative method is more than fast enough. For dynamic languages you may need to optimize by memoizing in order to obey the one minute rule (but this is not the case here).


Oops I did it again (this time examining further possible optimizations using C++).

Motti
Do you mean "memoizing"?
Svante
I can't tell if @Motti is being serious.
Larry
@Larry, I was, apologies, English isn't my native language and I seem to have being misreading memoization (Chrome's spell checker didn't help).
Motti
Heh, I wasn't quite sure, wasn't trying to be snarky, doubly because it was "dynamic programming" but you had some other acronym for "DP". =)
Larry
A: 

I solved the problem some time ago and luckily still have my code. Do not read the code if you don't want a spoiler:

#include <stdio.h>

int lookup[1000000] = { 0 };

unsigned int NextNumber(unsigned int value) {
  if ((value % 2) == 0) value >>= 1;
  else value = (value * 3) + 1;
  return value;
}

int main() {
  int i = 0;
  int chainlength = 0;
  int longest = 0;
  int longestchain = 0;
  unsigned int value = 0;
  for (i = 1; i < 1000000; ++i) {
    chainlength = 0;
    value = i;
    while (value != 1) {
      ++chainlength;
      value = NextNumber(value);
      if (value >= 1000000) continue;
      if (lookup[value] != 0) {
        chainlength += lookup[value];
        break;
      }
    }

    lookup[i] = chainlength;

    if (longestchain < chainlength) {
      longest = i;
      longestchain = chainlength;
    }
  }
  printf("\n%d: %d\n", longest, longestchain);
}

time ./a.out

[don't be lazy, run it yourself]: [same here]

real    0m0.106s
user    0m0.094s
sys     0m0.012s
dbemerlin
value >>= 1 Is this shift left one bit and assign?
paradox
>>= 1 is divide by 2. It would be totally sufficient to just write /= 2 as the compiler would optimize himself, it's simple premature optimization on my part.
dbemerlin
I noticed a error in your code. Chainlength should start at 1 instead of 0.
paradox
Thanks for pointing it out. Anyways, as the question asks for the highest number, not the most steps the result will still be the same, just the number of steps might be off by one (and real programmers begin counting with 0 anyways ;-) ).
dbemerlin
lol. the proverbial "programmers can't count" debate :p
paradox