Are there any "intelligent" or "learning" engines out there that can identify "evil" phrases in text (something like a learning spam filter, e.g. the one used in Thunderbird)?

For example, if I want to filter texts containing email addresses:

asdasd asd as d dgfdgfdgfdg sadasd(at)asfsdf.com

At first the tool wouldn't recognize this as an email address. But if the user "taught" the tool several times (for example by clicking a "text contains an email address" button) that text containing phrases like "xxxxx(at)xxxxx.xx" is suspicious, it would "learn" to mark such texts automatically in the future.

Question: Is there anything like this on the market? I found some libraries (like SpamAssassin, etc.), but these are "specialized" for emails.

+2  A: 

The general idea you are talking about is a Bayesian filter. Maybe that will help you in your searches.
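To make the idea concrete, here is a minimal sketch of a naive Bayes text filter in Python. It is only an illustration, not how Thunderbird or SpamAssassin work internally. The class name `NaiveBayesFilter`, the `features` helper, the labels "suspicious" and "ok", and the training sentences are all made up for this example. The one design assumption worth noting: each word is reduced to a coarse "shape" (runs of letters become `x`), so that "sadasd(at)asfsdf.com" and "bob(at)example.org" produce the same feature and the filter can generalize from a handful of clicks.

```python
import math
import re
from collections import Counter


def features(text):
    """Split on whitespace and reduce each token to a coarse shape.

    Runs of letters become 'x' and runs of digits become '9', so
    'sadasd(at)asfsdf.com' turns into 'x(x)x.x'. This shape trick is an
    assumption of this sketch, not part of Bayesian filtering itself.
    """
    shapes = []
    for token in text.lower().split():
        shape = re.sub(r"[a-z]+", "x", token)
        shape = re.sub(r"[0-9]+", "9", shape)
        shapes.append(shape)
    return shapes


class NaiveBayesFilter:
    """Hypothetical two-class naive Bayes filter with add-one smoothing."""

    LABELS = ("suspicious", "ok")

    def __init__(self):
        self.feature_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        # This is what the "text contains an email address" button would call.
        self.doc_counts[label] += 1
        self.feature_counts[label].update(features(text))

    def log_score(self, text, label):
        vocab = set()
        for counts in self.feature_counts.values():
            vocab.update(counts)
        total = sum(self.feature_counts[label].values())
        # Smoothed log prior for the class.
        score = math.log(
            (self.doc_counts[label] + 1)
            / (sum(self.doc_counts.values()) + len(self.LABELS))
        )
        # Smoothed log likelihood; the extra +1 reserves mass for unseen features.
        for feat in features(text):
            score += math.log(
                (self.feature_counts[label][feat] + 1) / (total + len(vocab) + 1)
            )
        return score

    def is_suspicious(self, text):
        return self.log_score(text, "suspicious") > self.log_score(text, "ok")


# Made-up training data standing in for the user's clicks.
spam_filter = NaiveBayesFilter()
spam_filter.train("asdasd asd sadasd(at)asfsdf.com", "suspicious")
spam_filter.train("write me at bob(at)example.org", "suspicious")
spam_filter.train("contact joe(at)mail.net please", "suspicious")
spam_filter.train("hello how are you today", "ok")
spam_filter.train("the weather is nice", "ok")
spam_filter.train("see you at the meeting tomorrow", "ok")

print(spam_filter.is_suspicious("reach me via anna(at)somewhere.de"))
print(spam_filter.is_suspicious("the meeting is tomorrow"))
```

With real data you would want far more training examples and richer features than a single shape per word, but the train-then-score loop stays the same.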

Edit: A few other examples:

Adam W
+1  A: 

Yeah, this seems to be a good start: http://nbayes.codeplex.com/ (a C# implementation of the Bayesian algorithm)

David