I'm using the BayesianClassifier class to classify spam. The problem is that compound words aren't being recognized.
For instance, if I add "led zeppelin" as a match, a sentence containing that phrase isn't recognized as a match, even though it should be.
To add a match I call addMatch() on SimpleWordsDataSource, and to check for a match I call isMatch() on BayesianClassifier.
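Here is a minimal sketch of what I suspect is happening. It does not use the library at all: CompoundWordSketch and containsMatchByToken() are hypothetical stand-ins, and the assumption that the input is split into single words before each one is looked up is my guess, not something I have confirmed in the library source.

```java
import java.util.HashSet;
import java.util.Set;

public class CompoundWordSketch {
    // Hypothetical stand-in for the word store; NOT the library's data source.
    private final Set<String> matches = new HashSet<>();

    public void addMatch(String word) {
        matches.add(word.toLowerCase());
    }

    // Assumed behaviour: split the input on non-word characters and
    // look up each single token on its own.
    public boolean containsMatchByToken(String input) {
        for (String token : input.toLowerCase().split("\\W+")) {
            if (matches.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CompoundWordSketch sketch = new CompoundWordSketch();

        // Stored as one entry containing a space.
        sketch.addMatch("led zeppelin");
        // Tokens are "i", "love", "led", "zeppelin"; none equals "led zeppelin".
        System.out.println(sketch.containsMatchByToken("I love led zeppelin")); // prints false

        // Adding the single word makes the per-token lookup succeed.
        sketch.addMatch("zeppelin");
        System.out.println(sketch.containsMatchByToken("I love led zeppelin")); // prints true
    }
}
```

If that guess is right, an entry containing a space can never equal a single token, which would explain what I'm seeing.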
Any ideas on how to fix this?
Thanks in advance!