I need to find the lemmas of words. I found the code below on the Stanford Javadoc website, but I cannot find these classes in the stanford-parser.jar file:

AnnotationPipeline, TokenAnnotation, SentenceAnnotation

These classes appear to be deprecated now. Which classes should I use instead?

As I understand it, the lemma of a word is its base (dictionary) form, for example:

  • word: created, lemma: create

  • word: modifies, lemma: modify

Please correct me if I am wrong. How do I find the lemma of a word?

public static void samplePipeline(String text) {
    AnnotationPipeline pipeline = new AnnotationPipeline();
    pipeline.addAnnotator(new PTBTokenizerAnnotator(false));
    pipeline.addAnnotator(new WordsToSentencesAnnotator(false));
    pipeline.addAnnotator(new POSTaggerAnnotator(false));
    pipeline.addAnnotator(new MorphaAnnotator(false));
    pipeline.addAnnotator(new OldNERAnnotator(false));
    pipeline.addAnnotator(new ParserAnnotator(false, false));

    // create annotation with text
    DocumentAnnotation document = new DocumentAnnotation(text);

    // annotate text with pipeline
    pipeline.annotate(document);

    // iterate through sentences, tokens, etc.
    for (SentenceAnnotation sentence : document.get(SentencesAnnotation.class)) {
        Tree tree = sentence.get(TreeAnnotation.class);
        for (TokenAnnotation token : sentence.get(TokensAnnotation.class)) {
            String tokenText = token.get(TextAnnotation.class);
            String tokenPOS = token.get(PartOfSpeechAnnotation.class);
            String tokenLemma = token.get(LemmaAnnotation.class);
            String tokenNE = token.get(NamedEntityTagAnnotation.class);
            ...
        }
    }
}
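
For comparison, here is a minimal sketch of how the same lemma lookup is usually written with the properties-based StanfordCoreNLP pipeline. It assumes the stanford-corenlp jar and its English models jar are on the classpath; these are distributed separately from stanford-parser.jar, which may be why the classes above cannot be found there. The class name LemmaSketch and the method lemmatize are hypothetical names chosen for illustration. In this API, sentences are CoreMap objects and tokens are CoreLabel objects, which seem to take the place of SentenceAnnotation and TokenAnnotation.

```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper class; requires the CoreNLP jar and models jar (assumption).
class LemmaSketch {

    // Returns one lemma per token, in document order.
    static List<String> lemmatize(String text) {
        // The lemma annotator depends on tokenization, sentence splitting and POS tags.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Create an annotation with the text and run the pipeline over it.
        Annotation document = new Annotation(text);
        pipeline.annotate(document);

        // Iterate through sentences and tokens, collecting each token's lemma.
        List<String> lemmas = new ArrayList<>();
        for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
            for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                lemmas.add(token.get(CoreAnnotations.LemmaAnnotation.class));
            }
        }
        return lemmas;
    }

    public static void main(String[] args) {
        System.out.println(lemmatize("She created the file and modifies it daily."));
    }
}
```

Building the pipeline loads the POS tagger model, so in real code it would make sense to create the StanfordCoreNLP object once and reuse it rather than rebuilding it on every call as this sketch does.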