I'm curious if anyone out there knows of something like Akismet, but where content doesn't have to be sent to a third-party server. With critically sensitive data (patient records, for instance), I wouldn't want that information sent to a server I don't control. I really like Akismet; it works great for the most part. However, I need something more like a private, local instance of Akismet that can be updated semi-regularly. Even better if it works with Python, since I need this to interface with Django applications. Should I just go the route of SpamBayes?
Have you looked at Project Honey Pot? I think they have some public querying services that you can use.
I think Project Honey Pot aims to stop spam before it even gets to your content-processing routine (IP checking, header analysis, bot traps and such). It might fit what you're trying to do.
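For what it's worth, here is a rough sketch of what an IP check against Project Honey Pot's http:BL DNS service could look like from Python. Only the visitor's IP address leaves your server, never the submitted content. This assumes the query format as I understand it from the http:BL documentation (access key, then the reversed IPv4 octets, then the `dnsbl.httpbl.org` zone) and the `127.<days>.<threat>.<type>` answer layout. The function names and the `abcdefghijkl` key are made up for illustration, so check the official docs before relying on any of it.

```python
import ipaddress
import socket

# Zone name as I understand it from the http:BL docs (assumption, verify it).
HTTPBL_ZONE = "dnsbl.httpbl.org"


def build_query(access_key, ip):
    """Build the DNS name to look up: <key>.<reversed octets>.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{access_key}.{reversed_octets}.{HTTPBL_ZONE}"


def parse_response(answer):
    """Split a 127.<days>.<threat>.<type> answer into named fields."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError(f"unexpected http:BL answer: {answer}")
    return {
        "days_since_last_activity": days,
        "threat_score": threat,
        # Assumed bitmask from the docs: 1 suspicious, 2 harvester,
        # 4 comment spammer (0 means search engine).
        "visitor_type": visitor_type,
        "is_comment_spammer": bool(visitor_type & 4),
    }


def lookup(access_key, ip):
    """Return parsed listing info, or None if the lookup does not resolve."""
    try:
        answer = socket.gethostbyname(build_query(access_key, ip))
    except socket.gaierror:
        # Usually means "not listed", but a DNS outage looks the same here.
        return None
    return parse_response(answer)
```

In a Django view you could call something like `lookup("abcdefghijkl", request.META["REMOTE_ADDR"])` with your own key and reject or flag the submission when `is_comment_spammer` is true or the threat score is above whatever threshold you pick.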
Another one I've heard of is Spamato. It can run as a standalone proxy. I've never really tried it out, but you could route content through your own proxy instance and get the spam filtered.
I can't think of any critically sensitive data that would be submitted by anonymous users. If the data is really sensitive (like the patient records you mentioned), it is probably submitted by known, registered users, so you should manually approve new users and protect the registration step from spammers.