Following the NLTK book, I first define a chunk grammar and then parse my tagged sentence with it:
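import nltk

# Chunk grammar from the NLTK book
grammar = r"""
  NP: {<DT|PP\$>?<JJ>*<NN>}   # determiner/possessive, adjectives and noun
      {<NNP>+}                # sequences of proper nouns
"""
cp = nltk.RegexpParser(grammar)
# sentence is a list of (word, tag) tuples
chunked_sent = cp.parse(sentence)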
When I print chunked_sent, I get this:
(S
  i/PRP
  use/VBP
  to/TO
  work/VB
  with/IN
  you/PRP
  at/IN
  (NP match/NN)
  ./.)
I don't want to just look at it. I want to actually pull out the "NP" noun phrases.
How can I print out "match", which is the noun phrase here? More generally, I want to extract every "NP" chunk from chunked_sent. Here is what I tried:
for k in chunked_sent:
    print k
(u'i', 'PRP')
(u'use', 'VBP')
(u'to', 'TO')
(u'work', 'VB')
(u'with', 'IN')
(u'you', 'PRP')
(u'at', 'IN')
(NP match/NN)
(u'.', '.')
for k in chunked_sent:
    print k[0]
i
use
to
work
with
you
at
(u'match', 'NN')
As you can see, indexing with k[0] loses the "NP" label: for the chunk I get the (word, tag) tuple inside it, not the chunk itself.
Also, how do I determine whether k[0] is a string or a tuple (as in the case above)?
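For reference, here is a minimal sketch of what I am after, assuming NLTK 3 (where a chunk's label is read with `label()`; older NLTK 2 releases exposed it as the `.node` attribute instead) and Python 3 print syntax. The tree is rebuilt by hand from the output above so the snippet runs without a tagger, and `noun_phrases` is a hypothetical helper name of my own, not part of NLTK. Chunks come back as `Tree` objects while plain tokens stay (word, tag) tuples, so `isinstance` should be enough to tell them apart.

```python
from nltk.tree import Tree

# Hand-built copy of the chunked sentence shown above; cp.parse(sentence)
# is expected to return a tree with this same shape.
chunked_sent = Tree("S", [
    ("i", "PRP"), ("use", "VBP"), ("to", "TO"), ("work", "VB"),
    ("with", "IN"), ("you", "PRP"), ("at", "IN"),
    Tree("NP", [("match", "NN")]),
    (".", "."),
])


def noun_phrases(tree):
    """Return the words of every NP chunk, one string per chunk."""
    phrases = []
    # subtrees() walks the tree itself and all nested chunks;
    # the filter keeps only those labelled NP.
    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP"):
        words = [word for word, tag in subtree.leaves()]
        phrases.append(" ".join(words))
    return phrases


print(noun_phrases(chunked_sent))  # ['match']

# Telling chunks apart from plain tokens while iterating:
for k in chunked_sent:
    if isinstance(k, Tree):
        print("chunk:", k.label(), k.leaves())
    else:
        print("token:", k)
```

Iterating over the top-level tree only sees direct children, so `subtrees()` is probably the safer choice if the grammar ever produces nested chunks.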