Hi, I have a fairly large document and want to do stop-word removal and stemming on its words in Python. Does anyone know of an off-the-shelf package for this? If not, code that is fast enough for large documents is also welcome. Thanks
Yes, use NLTK. It's open source and runs on Windows, Mac, and Linux.
Steven Rumbalski
2010-10-07 15:16:12
+2
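For anyone wondering what the NLTK route might look like, here is a minimal sketch. It assumes NLTK is installed and, for the stop-word list, that the "stopwords" corpus has been fetched once with nltk.download("stopwords"). The function names and the simple regex tokenizer are my own choices for illustration, not part of NLTK.

```python
import re

from nltk.stem import PorterStemmer


def load_english_stop_words():
    # Assumption: the NLTK "stopwords" corpus is already downloaded,
    # e.g. via nltk.download("stopwords").
    from nltk.corpus import stopwords
    return set(stopwords.words("english"))


def remove_stop_words_and_stem(text, stop_words):
    # Lowercase, split into alphabetic tokens, drop stop words,
    # then apply NLTK's Porter stemmer to what is left.
    stemmer = PorterStemmer()
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(t) for t in tokens if t not in stop_words]
```

Typical use would be something like remove_stop_words_and_stem(document_text, load_english_stop_words()). Keeping the stop words in a set makes each lookup constant time, which matters for large documents.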
A:
If for some reason you don't want to use NLTK, you can try PyStemmer. For stop words just download a list (google it) and filter them out.
lazy1
2010-10-07 16:00:57
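A rough sketch of this approach, in case it helps. The stop list below is a tiny hard-coded placeholder (a real one would be loaded from whatever list you download), and filter_and_stem and tokenize are hypothetical helper names. PyStemmer is treated as optional here: if it is not installed, the words are only filtered, not stemmed.

```python
import re

# Placeholder stop list for illustration only; replace it with a downloaded list.
STOP_WORDS = frozenset({"a", "an", "and", "are", "is", "of", "on", "the", "to"})


def tokenize(text):
    # Lowercase the text and keep runs of ASCII letters as words.
    return re.findall(r"[a-z]+", text.lower())


def filter_and_stem(text, stop_words=STOP_WORDS, stem=None):
    # Drop stop words; apply the stemming function if one was given.
    kept = (w for w in tokenize(text) if w not in stop_words)
    return [stem(w) for w in kept] if stem else list(kept)


try:
    import Stemmer  # PyStemmer
    stem = Stemmer.Stemmer("english").stemWord
except ImportError:
    stem = None  # fall back to filtering without stemming

print(filter_and_stem("The cats are sitting on the mat", stem=stem))
```

For a really large file you could call filter_and_stem once per line instead of reading everything into memory at once.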