Does anyone know the easiest way to extract only nouns from a body of text?
I've heard about the TreeTagger tool and gave it a shot, but I couldn't get it to work.
Any suggestions?
Thanks, Phil
EDIT:
import org.annolab.tt4j.*;

TreeTaggerWrapper tt = new TreeTaggerWrapper();
try {
    tt.setModel("/Nouns/english.par");
    tt.setHandler(new TokenHandler() {
        void token(String token, String pos, String lemma) {
            System.out.println(token + "\t" + pos + "\t" + lemma);
        }
    });
    tt.process(words); // words = list of words
} finally {
    tt.destroy();
}
That is my code; the text I'm tagging is English. I get this error: The type new TokenHandler(){} must implement the inherited abstract method TokenHandler.token. Am I doing something wrong?
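A possible cause of that compiler error, assuming tt4j declares the handler as a generic interface along the lines of TokenHandler<O> with a method token(O token, String pos, String lemma): with the raw type TokenHandler, the inherited method takes an Object as its first argument, so a token(String, String, String) method does not count as implementing it. Interface methods are also implicitly public, so the implementing method needs the public modifier. The sketch below does not call tt4j itself. It uses a hypothetical stand-in interface of the same shape so that it runs on its own, and it also shows one way to keep only nouns by checking whether the part-of-speech tag starts with "N". The class names NounFilter and NounCollector, and the tag names in the sample data, are illustrative assumptions; check the tagset of the model you actually load.

```java
import java.util.ArrayList;
import java.util.List;

public class NounFilter {

    // Hypothetical stand-in with the same shape that tt4j's generic
    // TokenHandler<O> is assumed to have. It is NOT the library's interface.
    interface TokenHandler<O> {
        void token(O token, String pos, String lemma);
    }

    // Collects every token whose part-of-speech tag starts with "N".
    // Assumption: noun tags in the model's tagset begin with "N".
    static class NounCollector implements TokenHandler<String> {
        final List<String> nouns = new ArrayList<>();

        // Must be public, and the first parameter must match the type
        // argument (String here), or the interface method is not implemented.
        @Override
        public void token(String token, String pos, String lemma) {
            if (pos != null && pos.startsWith("N")) {
                nouns.add(token);
            }
        }
    }

    public static void main(String[] args) {
        NounCollector collector = new NounCollector();

        // Hand-written sample (token, tag, lemma) triples standing in for
        // what a tagger would pass to the handler; the tags are illustrative.
        String[][] tagged = {
            {"The", "DT", "the"},
            {"dog", "NN", "dog"},
            {"chased", "VVD", "chase"},
            {"cats", "NNS", "cat"},
        };
        for (String[] t : tagged) {
            collector.token(t[0], t[1], t[2]);
        }

        System.out.println(collector.nouns); // prints [dog, cats]
    }
}
```

If the assumption about the generic interface holds, the equivalent change in the code above would be to declare TreeTaggerWrapper<String> and new TokenHandler<String>(), and to mark the token method public.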