I'm creating a website where users can write articles and comment on them. I want to automatically check whether each new article or comment is spam.

What are good libraries for doing this?

I looked at Bayesian classifier libraries, but it seems I would have to gather a large number of samples and label each one as spam or not spam myself...

I'm hoping for something that can tell me right out of the box.

UPDATE: If nothing like this exists, does anyone know of a large downloadable set of messages already labeled as spam or not spam that I could feed into a Bayesian classifier?

A: 

Mollom isn't free, but it provides an API too.

http://mollom.com/features

http://mollom.com/pricing

Kevin
A: 

Check out the Akismet .NET 2.0 API on CodePlex.

Here's an example from the CodePlex page:

// Verify key
Akismet api = new Akismet("key", "http://url.com", "Test/1.0");
if (!api.VerifyKey()) throw new Exception("Key could not be verified.");

// Create comment object for testing
AkismetComment comment = new AkismetComment();
comment.Blog = "http://joel.net";
comment.UserIp = "147.202.45.202";
comment.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)";
comment.CommentContent = "<a href=\"http://someone.finderinn.com\">find someone</a>";
comment.CommentType = "comment";
comment.CommentAuthor = "someone";
comment.CommentAuthorEmail = "[email protected]";
comment.CommentAuthorUrl = "http://someone.finderrin.com";

// Test comment against Akismet's service
bool isSpam = api.CommentCheck(comment);
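
What to do with the result is up to the site. Here is a minimal sketch of one option, assuming you want to hold anything suspicious for review and also hold content when the check itself fails (network error, rejected key). The `ModerationResult` enum and `SpamGate` class are hypothetical names for illustration, not part of the Akismet library. The `isSpam` delegate stands in for a call such as `c => api.CommentCheck(c)`.

```csharp
using System;

// Hypothetical outcome type for illustration; not part of the Akismet library.
public enum ModerationResult { Publish, HoldForReview }

public static class SpamGate
{
    // 'isSpam' is any spam check, e.g. c => api.CommentCheck(c).
    public static ModerationResult Decide<T>(T comment, Func<T, bool> isSpam)
    {
        try
        {
            return isSpam(comment)
                ? ModerationResult.HoldForReview
                : ModerationResult.Publish;
        }
        catch (Exception)
        {
            // The check could not be completed: queue the content for a
            // human instead of publishing it unchecked.
            return ModerationResult.HoldForReview;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Stub checker so the sketch runs without an API key or network call.
        Func<string, bool> stub = text => text.Contains("http://");

        Console.WriteLine(SpamGate.Decide("nice article", stub));
        Console.WriteLine(SpamGate.Decide("buy at http://example.com", stub));
    }
}
```

Holding content rather than rejecting it outright means a false positive only costs a moderator a click and a legitimate comment is not lost.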

Akismet rocks.

-Charles

Charles