I am looking for a Python library which does Bayesian Spam Filtering. I looked at SpamBayes and OpenBayes, but both seem to be unmaintained (I might be wrong).

Can anyone suggest a good Python (or Clojure, Common Lisp, even Ruby) library which implements Bayesian Spam Filtering?

Thanks in advance.

Clarification: I am actually looking for a Bayesian spam classifier, not necessarily a spam filter. I just want to train it on some data and later have it tell me whether some given data is spam. Sorry for any confusion.

+2  A: 

Try bogofilter. I'm not sure how it can be used from Python, but it is integrated with many mail systems, so interfacing with it should be relatively easy.
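
One way to drive it from Python might be to shell out to the command-line tool. This is only a sketch: it assumes the `bogofilter` binary is on your PATH, that it reads the message from standard input, and that it reports its verdict through the exit status (0 spam, 1 non-spam, 2 unsure) as the man page describes. The function names are my own, not part of bogofilter.

```python
import subprocess

# Exit statuses as described in the bogofilter man page (assumption: your
# installed version behaves the same way).
_VERDICTS = {0: "spam", 1: "ham", 2: "unsure"}


def classify_with_bogofilter(message, run=subprocess.run):
    """Pipe one raw message (bytes) to bogofilter and map its exit status.

    `run` is injectable so the mapping can be exercised without the binary.
    """
    result = run(["bogofilter"], input=message, stdout=subprocess.DEVNULL)
    if result.returncode not in _VERDICTS:
        raise RuntimeError("bogofilter failed with status %d" % result.returncode)
    return _VERDICTS[result.returncode]


def train_bogofilter(message, is_spam, run=subprocess.run):
    """Register one raw message (bytes) as spam (-s) or non-spam (-n)."""
    flag = "-s" if is_spam else "-n"
    run(["bogofilter", flag], input=message)
```

Spawning a process per message is slow for bulk work, but it keeps the Python side trivial.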

gimel
+5  A: 

Do you want spam filtering or Bayesian classification?

For Bayesian classification there are a number of Python modules. I recently reviewed Orange, which looks very impressive. R also has a number of Bayesian modules, and you can use Rpy to hook into R from Python.

Daniel
+9  A: 

Try Reverend. It's a spam filtering module in a single file.

Seun Osewa
+2  A: 

SpamBayes is maintained, and is mature (i.e. it works without having to have new releases all the time). It will easily do what you want. Note that SpamBayes is only loosely Bayesian (it uses chi-squared combining), but presumably you're after any sort of statistical token-based classification, rather than something specifically Bayesian.
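
To give a rough idea of what chi-squared combining means, here is a minimal sketch. It is not SpamBayes's actual code or API: the names `chi2q` and `combine` are made up for illustration, and it assumes you already have a per-token spam probability strictly between 0 and 1 for each token in the message.

```python
import math


def chi2q(x2, dof):
    """Chi-squared survival function, valid for an even `dof` only."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Rounding can push the sum slightly above 1.
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities into one score in [0, 1].

    Each probability must lie strictly between 0 and 1, otherwise the
    logarithms below are undefined.
    """
    if not probs:
        return 0.5  # no evidence either way
    n = len(probs)
    ln_not_spam = sum(math.log(1.0 - p) for p in probs)
    ln_spam = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * ln_not_spam, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)      # evidence for ham
    return (s - h + 1.0) / 2.0
```

Scores near 1 suggest spam and scores near 0 suggest ham. When the evidence conflicts, the score lands near 0.5, which is what makes an "unsure" category possible.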

Tony Meyer
+1  A: 

A module in the Python natural language toolkit (nltk) does naïve Bayesian classification: nltk.classify.naivebayes.
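
For the "train it, then ask it" use case in the question, the underlying technique is small enough to sketch with the standard library alone. To be clear, this is not nltk's API: the `NaiveBayes` class, its method names, and the tokenizer are hypothetical, and it uses add-one smoothing, which is just one common choice.

```python
import math
import re
from collections import Counter


class NaiveBayes:
    """Toy multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.token_counts = {}       # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents

    @staticmethod
    def tokenize(text):
        # Crude tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, label, text):
        self.doc_counts[label] += 1
        self.token_counts.setdefault(label, Counter()).update(self.tokenize(text))

    def classify(self, text):
        """Return the most likely label; requires at least one train() call."""
        vocab = set().union(*self.token_counts.values())
        total_docs = sum(self.doc_counts.values())
        tokens = self.tokenize(text)
        scores = {}
        for label, counts in self.token_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


clf = NaiveBayes()
clf.train("spam", "win cash now")
clf.train("spam", "cheap pills win prize")
clf.train("ham", "meeting at noon tomorrow")
clf.train("ham", "lunch with the project team")
print(clf.classify("win a cheap prize"))
```

Working in log space avoids underflow when messages contain many tokens.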

Disclaimer: I know crap all about Bayesian classification, naïve or worldly.

Paul D. Waite