I would do it in C#, but that's only because it's the language that I'm most familiar with at the moment, and because I know it's got strong string handling. It can also be done in C++ with stl::string classes, Ruby, Java, etc.
If I were building a naive bayes classifier, I'd start with a simple example, like the one in Russell & Norvig's book (the one I learned off of way back when, in the second edition of the book) or the one in Mitchell's book (I used his because he taught the class). Make your learner generate rules in a general fashion; that is, given input data, produce output rules, and have the input data be a generalizable thing (could be a block of text that for spam detection, could be a weather report to predict if someone's going to play tennis).
If you're trying to learn Bayes classifiers, a simple example like this is better to start with than a full-blown spam filter. Language parsing is hard in and of itself, and then determining whether or not there's garbage language is also difficult. Better to have a simple, small dataset, one where you can derive how your learner should learn and make sure that your program matches what you want it to do. Then, you can grow your dataset, or modify your program to incorporate things like language parsing.