views:

72

answers:

2

Does anyone know of something like Akismet that doesn't require sending content off to a third-party server? With critically sensitive data (patient records, for instance), I wouldn't want that information sent to a server I don't control. I really like Akismet, and it works great for the most part. However, I need something more like a private, local instance of Akismet that can be updated semi-regularly. Even better if it works with Python, since it needs to interface with Django applications. Should I just go the SpamBayes route?

+3  A: 

Have you looked at Project Honey Pot? I think they have some public querying services that you can use.

I think Project Honey Pot aims to stop spam before it even gets to your content-processing routine (IP checking, header analysis, bot traps and such). It might fit what you're trying to do.
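For what it's worth, here is a rough sketch of what an IP check against Project Honey Pot's http:BL DNS blacklist could look like from Python. Only the visitor's IP address and your access key leave your server, never the submitted content. The query and response layout below is my reading of the http:BL documentation, so verify it there before relying on it; the function names and the access key are made up for illustration.

```python
import ipaddress
import socket

# Assumption: http:BL lookups go to this DNS zone (check the official docs).
HTTPBL_ZONE = "dnsbl.httpbl.org"


def build_httpbl_query(access_key: str, ip: str) -> str:
    """Build the DNS name to look up: <key>.<reversed IPv4 octets>.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{access_key}.{reversed_octets}.{HTTPBL_ZONE}"


def parse_httpbl_response(answer: str) -> dict:
    """Decode an answer of the assumed form 127.<days>.<threat>.<type>."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError(f"unexpected http:BL answer: {answer}")
    # Assumption: the last octet is a bitmask (1 = suspicious, 2 = harvester,
    # 4 = comment spammer) and 0 means a known search engine, in which case
    # the middle octets carry a different meaning and should be ignored.
    return {
        "days_since_last_activity": days,
        "threat_score": threat,
        "search_engine": visitor_type == 0,
        "suspicious": bool(visitor_type & 1),
        "harvester": bool(visitor_type & 2),
        "comment_spammer": bool(visitor_type & 4),
    }


def lookup(access_key: str, ip: str):
    """Return the decoded listing, or None when the IP is not listed."""
    try:
        answer = socket.gethostbyname(build_httpbl_query(access_key, ip))
    except socket.gaierror:
        return None  # no DNS record: the IP is not in the blacklist
    return parse_httpbl_response(answer)
```

In a Django view you could call something like `lookup(key, request.META["REMOTE_ADDR"])` and reject or moderate the submission when `comment_spammer` is set or the threat score is above whatever threshold you settle on.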

Another one I've heard of is Spamato. It can run as a standalone proxy. I've never really tried it out, but you could route content through your own proxy instance and get the spam filtered.

chakrit
Project Honey Pot's blacklist is what I ended up going with. Seems like it just might work for what I need.
f4nt
A: 

I can't think of any critically sensitive data that could be submitted by anonymous users. If the data is really sensitive (like the patient records you mentioned), it is probably submitted by known, registered users, so you should manually approve new users and protect the registration step from spammers.

fest
This isn't always the case. I agree that it should be, and if I called all the shots in the world I'd agree with you. Unfortunately, I don't always get to write the specs :)
f4nt
Yeah, developing software by relying solely on specs is hard. However, can't you just get by with some unobtrusive anti-spam measures, like the ones used by the Django comments framework? http://docs.djangoproject.com/en/dev/ref/contrib/comments/#notes-on-the-comment-form Otherwise, just use SpamBayes ;)
fest
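
To make the "unobtrusive measures" idea concrete, here is a minimal, framework-free sketch of two of them: a hidden honeypot field that humans leave empty, and a check on how long the form took to submit. Everything runs locally, so no content leaves your server. This is not the Django comments framework's actual code; the field names and time limits are hypothetical, and a real form would also sign the timestamp (the Django comment form uses a security hash for that) so a bot can't simply forge it.

```python
import time

# Hypothetical field names and limits, chosen only for illustration.
HONEYPOT_FIELD = "website"      # hidden via CSS; real users never fill it in
TIMESTAMP_FIELD = "rendered_at" # Unix time at which the form was rendered
MIN_SECONDS = 3                 # faster than this is unlikely to be a human
MAX_SECONDS = 2 * 60 * 60       # older than this is treated as stale


def looks_like_bot(form_data, now=None):
    """Return True if the submitted form data trips one of the bot checks."""
    now = time.time() if now is None else now

    # 1. Honeypot: any non-blank value means something filled every field.
    if (form_data.get(HONEYPOT_FIELD) or "").strip():
        return True

    # 2. Timing: a missing or malformed timestamp is treated as suspicious.
    try:
        rendered_at = float(form_data.get(TIMESTAMP_FIELD, ""))
    except (TypeError, ValueError):
        return True

    elapsed = now - rendered_at
    return elapsed < MIN_SECONDS or elapsed > MAX_SECONDS
```

In a view you would pass in something like `request.POST` and send flagged submissions to moderation rather than dropping them silently, since both checks can produce false positives (browser autofill, very slow typists).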