I'm reading Paul Graham's A Plan for Spam and want to understand it better but my LISP is really rusty. He has a snippet of code that calculates probability as such:

(let ((g (* 2 (or (gethash word good) 0)))
      (b (or (gethash word bad) 0)))
   (unless (< (+ g b) 5)
     (max .01
          (min .99 (float (/ (min 1 (/ b nbad))
                             (+ (min 1 (/ g ngood))   
                                (min 1 (/ b nbad)))))))))

My question is twofold: (1) is there a web resource that will convert LISP to a different language? (my preference would be a C based language) or failing that (2) can someone rewrite that code snippet in C# for me?

+9  A: 

I think it's something like this (warning, possible errors ahead. This snippet is intended as a guide, not a solution):

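// Assumes gethash returns int? (null when the word is absent);
// ?? then mirrors Lisp's (or ... 0), which the bitwise | does not.
var g = 2 * (gethash(word, good) ?? 0);
var b = gethash(word, bad) ?? 0;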

if( (g + b) >= 5)
{
    return Math.Max( 
     0.01, 
     Math.Min(0.99, 
      Math.Min(1, b / nbad) / 
      (Math.Min(1, g / ngood) + Math.Min(1, b / nbad))));
}
Gonzalo Quero
Thanks Gonzalo - much appreciated.
Guy
The if condition is off by one: ((g+b) > 4) or ((g+b) >= 5) is correct. According to PG, 5 is just an arbitrary threshold, so it isn't that important.
Nathan Sanders
@Nathan You're right. Corrected.
Gonzalo Quero
BUG: the original code uses exact rational arithmetic, so if 'b' and 'nbad' are integers, '(/ b nbad)' in Lisp yields an exact result, but 'b / nbad' in C# does not: it truncates. You need to cast one or the other to a floating-point type first in C#.
Aaron
What's the point in using var here? It makes the code look like JavaScript
Waleed Eissa
I think there's a problem with the code here. As b and nbad are both integers (likewise g and ngood), the result of the division will almost always be zero, because nbad is normally much larger than b (it is the number of messages/posts). You should cast one of them (or both) to double.
Waleed Eissa
+5  A: 

Adding on to Gonzalo's answer, don't forget that Lisp provides arbitrary-precision integers and exact rationals, while C# integer division truncates. You'll need to cast 'nbad' and 'ngood' to a floating-point type such as double first to get comparable (though not identical) results.
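To make the truncation concrete, here is a minimal sketch. The class and method names (DivisionDemo, IntRatio, DoubleRatio) are made up for illustration and are not part of the original code.

```csharp
using System;

static class DivisionDemo
{
    // Both operands are int, so C# performs integer division
    // and discards the fractional part.
    public static int IntRatio(int b, int nbad) => b / nbad;

    // Casting one operand to double forces floating-point division.
    public static double DoubleRatio(int b, int nbad) => b / (double)nbad;
}
```

With b = 3 and nbad = 100, IntRatio gives 0 while DoubleRatio gives roughly 0.03, which is why the uncast version of the formula collapses for most words.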

You may also want to put the whole converted program in a checked region. By default C# doesn't even warn on integer overflow; the value silently wraps. As a first approximation, treat the overflow exception raised in a checked region as if you had run out of memory (in Lisp, if a result is too big to fit in the remaining memory, similar behavior results).

checked {
    var fbad = (double)nbad;
    var fgood = (double)ngood;
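    // Assumes gethash returns int? (null when the word is absent);
    // ?? then mirrors Lisp's (or ... 0), which the bitwise | does not.
    var g = 2 * (gethash(word, good) ?? 0);
    var b = gethash(word, bad) ?? 0;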


    if( (g + b) >= 5)
    {
        return Math.Max( 
            0.01, 
            Math.Min(0.99, 
                    Math.Min(1, b / fbad) / 
                    (Math.Min(1, g / fgood) + Math.Min(1, b / fbad))));
    }
}
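For anyone who wants something that compiles on its own, here is one possible self-contained sketch of the same calculation. It assumes the 'good' and 'bad' tables are dictionaries from word to count and that ngood and nbad are both greater than zero. The class name SpamProbability and the method name Compute are invented for this example. Returning null stands in for the Lisp 'unless' form yielding nil when the word is too rare.

```csharp
using System;
using System.Collections.Generic;

static class SpamProbability
{
    // Hypothetical helper, not from the original post.
    // Returns null when the word has too few occurrences to be scored.
    public static double? Compute(
        string word,
        IReadOnlyDictionary<string, int> good,
        IReadOnlyDictionary<string, int> bad,
        int ngood,
        int nbad)
    {
        // (or (gethash word good) 0): a missing word counts as zero.
        int g = 2 * (good.TryGetValue(word, out int goodCount) ? goodCount : 0);
        int b = bad.TryGetValue(word, out int badCount) ? badCount : 0;

        // (unless (< (+ g b) 5) ...): skip words seen fewer than five times.
        if (g + b < 5)
        {
            return null;
        }

        // Cast to double so the divisions do not truncate.
        double goodRatio = Math.Min(1.0, g / (double)ngood);
        double badRatio = Math.Min(1.0, b / (double)nbad);

        // Clamp the result to the range [0.01, 0.99].
        return Math.Max(0.01, Math.Min(0.99, badRatio / (goodRatio + badRatio)));
    }
}
```

A word that appears 20 times in 100 bad messages and never in the good ones comes out at 0.99, the upper clamp. A word absent from both tables returns null.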
Aaron