The Stanford NLP, demo'd here, gives an output like this:

Colorless/JJ green/JJ ideas/NNS sleep/VBP furiously/RB ./.

What do the Part of Speech tags mean? I am unable to find an official list. Is it Stanford's own system, or are they using universal tags? (What is JJ, for instance?)

Also, when I am iterating through the sentences, looking for nouns, for instance, I end up doing something like checking to see if the tag .contains('N'). This feels pretty weak. Is there a better way to programmatically search for a certain part of speech?

+1  A: 

Just from exploring the site, http://nlp.stanford.edu/software/parser-faq.shtml looks like it might have useful information for you, or at least point you in the correct direction for more documentation.

matt b
+3  A: 

They seem to be Brown Corpus tags.

Jonathan Feinberg
No, they are Penn English Treebank POS tags, which are a simplification of the Brown Corpus tag set.
Christopher Manning
A: 

I don't know if you are locked into that library, or even into Java. If you get a chance, check out this book:

Natural Language Processing with Python

It's been very handy for some of my projects. I love Java, but I think NLP needs a far more interactive language if you want to be productive and not go mad. The book uses the Natural Language Toolkit (NLTK). Lisp is also a beautiful language for NLP, but a little more difficult to set up.

Best of luck.

Off Rhoden
+8  A: 

The tags come from the Penn Treebank Project. Look at its "Part-of-speech tagging" .ps (PostScript) file.

JJ is an adjective. NNS is a plural noun. VBP is a present-tense verb (non-3rd person singular). RB is an adverb.

That's for English. For Chinese, it's the Penn Chinese Treebank. And for German it's the NEGRA corpus.

  1. CC Coordinating conjunction
  2. CD Cardinal number
  3. DT Determiner
  4. EX Existential there
  5. FW Foreign word
  6. IN Preposition or subordinating conjunction
  7. JJ Adjective
  8. JJR Adjective, comparative
  9. JJS Adjective, superlative
  10. LS List item marker
  11. MD Modal
  12. NN Noun, singular or mass
  13. NNS Noun, plural
  14. NNP Proper noun, singular
  15. NNPS Proper noun, plural
  16. PDT Predeterminer
  17. POS Possessive ending
  18. PRP Personal pronoun
  19. PRP$ Possessive pronoun
  20. RB Adverb
  21. RBR Adverb, comparative
  22. RBS Adverb, superlative
  23. RP Particle
  24. SYM Symbol
  25. TO to
  26. UH Interjection
  27. VB Verb, base form
  28. VBD Verb, past tense
  29. VBG Verb, gerund or present participle
  30. VBN Verb, past participle
  31. VBP Verb, non-3rd person singular present
  32. VBZ Verb, 3rd person singular present
  33. WDT Wh-determiner
  34. WP Wh-pronoun
  35. WP$ Possessive wh-pronoun
  36. WRB Wh-adverb
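
As for searching programmatically: going by the list above, every noun tag starts with NN, every verb tag with VB, every adjective tag with JJ and every adverb tag with RB. A prefix test is therefore tighter than a substring test. Checking for 'N' anywhere in the tag would also match IN and VBN, for example. Below is a minimal Python sketch that works directly on the word/TAG text shown in the question. The names COARSE_PREFIXES, parse_tagged and words_with_category are made up for illustration and are not part of any Stanford or NLTK API, and the grouping into four coarse categories is my own assumption.

```python
# Coarse categories mapped to Penn Treebank tag prefixes.
# This grouping is an illustrative assumption, not an official scheme.
COARSE_PREFIXES = {
    "noun": ("NN",),       # NN, NNS, NNP, NNPS
    "verb": ("VB",),       # VB, VBD, VBG, VBN, VBP, VBZ
    "adjective": ("JJ",),  # JJ, JJR, JJS
    "adverb": ("RB",),     # RB, RBR, RBS
}


def parse_tagged(text):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last slash so a word that itself contains '/' survives.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def words_with_category(pairs, category):
    """Return the words whose tag starts with one of the category's prefixes."""
    prefixes = COARSE_PREFIXES[category]
    return [word for word, tag in pairs if tag.startswith(prefixes)]


tagged = "Colorless/JJ green/JJ ideas/NNS sleep/VBP furiously/RB ./."
pairs = parse_tagged(tagged)
print(words_with_category(pairs, "noun"))
```

In Java the same test would be tag.startsWith("NN") in place of tag.contains("N").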
anno