NOTE: This is a solution for Project Euler problem 14. If you still want to solve it yourself then don't read on.
The problem was to find the number below one million which, as starting number for the Collatz sequence, produces the longest such sequence. My initial code was as follows:
$r = @{}
for($i = 1; $i -lt 1000000; $i++) {
$n = 0
$a = $i
while ($a -gt 1) {
if ($r[$a]) {
$n += $r[$a]
break
}
if ($a % 2 -eq 0) {
$a /= 2
} else {
$a = $a * 3 + 1
}
$n++
}
$r[$i] = $n
}
$r.GetEnumerator() | sort Value | select -l 1 | %{$_.Key}
which tries to use a hashtable as cache for sub-sequences already encountered to save time. I was quite surprised when that script had a runtime of over eight minutes on my machine. Re-creating the same code in C#:
using System;
using System.Collections.Generic;
class Problem14
{
public static void Main()
{
var ht = new Dictionary<long, int>();
for (int i = 1; i < 1000000; i++)
{
int count = 0;
long j = i;
while (j > 1)
{
if (ht.ContainsKey(j))
{
count += ht[j];
break;
}
if (j % 2 == 0)
j /= 2;
else
j = 3 * j + 1;
count++;
}
ht[i] = count;
}
KeyValuePair<long, int> max = new KeyValuePair<long, int>();
foreach (var n in ht)
{
if (n.Value > max.Value)
max = n;
}
Console.WriteLine(max.Key);
}
}
had a runtime of just over one second. I knew that speed of execution wasn't a prime goal in Powershell. It's an adminstration language and for those tasks the ratio of PS code to cmdlets is likely very different from what I do here.
Still, I don't know what exactly causes the slowdown.
Suspecting the hash table, I replaced it for caching with an array. This resulted in an execution time around 200 ms in C# and about 32 minutes in Powershell. Code was as follows:
$r = ,0*1000000
for($i = 1; $i -lt 1000000; $i++) {
$n = 0
$a = $i
while ($a -gt 1) {
if ($r[$a]) {
$n += $r[$a]
break
}
if ($a % 2 -eq 0) {
$a /= 2
} else {
$a = $a * 3 + 1
}
$n++
}
if ($i -lt 1000000) {
$r[$i] = $n
}
}
$max = 0
for($i=1; $i -lt 1000000; $i++) {
if ($r[$i] > $r[$max]) {
$max = $i
}
}
$max
and
using System;
class Problem14
{
public static void Main()
{
var cache = new int[1000000];
for (int i = 1; i < 1000000; i++)
{
int count = 0;
long j = i;
while (j > 1)
{
if (j < 1000000 && cache[j] != 0)
{
count += cache[j];
break;
}
if (j % 2 == 0)
j /= 2;
else
j = 3 * j + 1;
count++;
}
cache[i] = count;
}
var max = 0;
for (int i = 1; i < cache.Length; i++)
{
if (cache[i] > cache[max])
max = i;
}
Console.WriteLine(max);
}
}
A completely cacheless variant was around 1.2 seconds in C#. Didn't try yet in Powershell.
Any ideas?