I need a good stemming algorithm for a project I'm working on. It was suggested that I look at the Porter stemmer, but when I checked the Porter stemmer's page I found that it is now deprecated in favor of the "Snowball" stemmer.

I need a good stemmer, but I can't really spend significant time implementing (or optimizing) my own. What is the best "off the shelf", freely available stemmer? Are there any non-free stemmers available for a reasonable price? Or, is the Snowball stemmer my best bet?

+1  A: 

It really depends on how you're planning to apply it. The Natural Language Toolkit (http://nltk.sourceforge.net) implements a number of stemmers that should be able to handle most applications. I prefer Morphy, which comes from WordNet and is strictly a lemmatizer (it maps words to dictionary forms) rather than a suffix-stripping stemmer.
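As a rough illustration of the NLTK route, here is a minimal sketch using `SnowballStemmer` from `nltk.stem.snowball`, where the `"english"` setting selects the Snowball English (Porter2) rules. The helper name `stem_words` is made up for this example, and it assumes NLTK is installed (for instance via `pip install nltk`).

```python
def stem_words(words, language="english"):
    """Return the Snowball stem of each word (hypothetical helper)."""
    # Imported lazily so this file still loads where NLTK is not installed.
    from nltk.stem.snowball import SnowballStemmer

    # "english" selects NLTK's implementation of the Porter2 rules.
    stemmer = SnowballStemmer(language)
    return [stemmer.stem(word) for word in words]
```

Calling `stem_words(["running", "cats"])` returns the stems as a list in the same order. The Snowball stemmers are pure rule sets, so no corpus download should be needed for this, unlike the WordNet-based tools.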

Of course, NLTK is written in Python, so if you're working in another language you can always read through the source to glean the algorithm and port it to your language of choice. Python is highly readable.
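To give a sense of what porting involves, here is a sketch of just step 1a (plural endings) of the original Porter algorithm, following the rules in Porter's published description. This is not NLTK's code, the function name `porter_step_1a` is hypothetical, and a usable stemmer needs all the remaining steps. Porter2 also changes some of these rules, so treat it as an illustration only.

```python
def porter_step_1a(word):
    """Apply step 1a of the original Porter algorithm (plural endings)."""
    # Rules are tried longest suffix first; only the first match is applied.
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # ponies -> poni
    if word.endswith("ss"):
        return word        # caress -> caress (unchanged)
    if word.endswith("s"):
        return word[:-1]   # cats -> cat
    return word


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

Each step is a short table of suffix rules like this one, which is why the algorithm ports easily between languages.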

Robert Elwell
+3  A: 

The Porter2 stemmer is the one I've decided to go with. The Porter stemmer seemed to be the standard, but when I found the author's page, he recommended the "Snowball (Porter2)" stemmer instead. There is a link to a C port on this page.

dicroce