Here is an improved version of my previous answer. This one uses regular expression matching to make a fuzzy match on the verb. These all work:
Steve loves Denise
Bears love honey
Maria interested Anders
Maria interests Anders
The regular expression pattern "loves?" matches "love" plus an optional 's'. The pattern "interest.*" matches "interest" plus anything. Patterns with multiple alternatives separated by vertical bars match if any one of the alternatives matches.
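As a quick check of the patterns described above, here is a minimal sketch, assuming Python 3. The name `verb_pat` is only for illustration. It also shows a caveat: `match()` anchors only at the start of the word, so "loves?" accepts longer words too, and `fullmatch()` is one way to tighten that if you need to.

```python
import re

# One of the pattern strings from this answer, compiled on its own.
verb_pat = re.compile(r"likes?|loves?|interest.*")

# match() tests the pattern against the beginning of each word.
for word in ["love", "loves", "interested", "interests", "honey"]:
    print(word, bool(verb_pat.match(word)))

# Caveat: match() does not require the whole word to match,
# so "lovely" is accepted because it starts with "love".
print(bool(verb_pat.match("lovely")))

# fullmatch() (Python 3.4+) requires the entire word to match.
print(bool(verb_pat.fullmatch("lovely")))
```

For the sample sentences above the difference does not matter, but it could if your input contains words that merely start with one of the verbs.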
import re

re_map = [
    ("likes?|loves?|interest.*", "red"),
    ("dislikes?|hates?", "blue"),
    ("knows?|tolerates?|ignores?", "black"),
]

# Compile the regular expressions once, then reuse them many times.
pat_map = [(re.compile(s), color) for s, color in re_map]

# We don't use is_verb() in this version, but here it is.
# A word is a verb if any of the patterns match.
def is_verb(word):
    return any(pat.match(word) for pat, color in pat_map)

# Return the color for a matched verb, or None if no pattern matches.
# This detects whether a word is a verb and looks up its color in one pass.
def color_from_verb(word):
    for pat, color in pat_map:
        if pat.match(word):
            return color
    return None

def make_noun(lst):
    if not lst:
        return "--NONE--"
    elif len(lst) == 1:
        return lst[0]
    else:
        return "_".join(lst)

for line in open("filename"):
    words = line.split()
    # subject could be one or two words
    color = color_from_verb(words[1])
    if color:
        # subject was one word
        s = words[0]
        o = make_noun(words[2:])
    else:
        # subject was two words, so the verb is the third word
        color = color_from_verb(words[2])
        assert color
        s = make_noun(words[0:2])
        o = make_noun(words[3:])
    print("%s -> %s %s;" % (s, o, color))
I hope it is clear how to take this answer and extend it. You can easily add more patterns to match more verbs. You could add logic to detect "is" and "in" and discard them, so that "Anders is interested in Maria" would match. And so on.
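To make that extension concrete, here is one possible sketch, assuming Python 3. The names `FILLER` and `parse_line` are hypothetical and not part of the code above, and the set of filler words is just a guess at what you might want to discard. Note that it scans for the verb instead of checking fixed positions, which is a slightly different approach from the loop above.

```python
import re

# Hypothetical filler words to drop before looking for the verb.
FILLER = {"is", "in", "are"}

RE_MAP = [
    ("likes?|loves?|interest.*", "red"),
    ("dislikes?|hates?", "blue"),
    ("knows?|tolerates?|ignores?", "black"),
]
PAT_MAP = [(re.compile(s), color) for s, color in RE_MAP]

def color_from_verb(word):
    """Return the color for a matched verb, or None if no pattern matches."""
    for pat, color in PAT_MAP:
        if pat.match(word):
            return color
    return None

def parse_line(line):
    """Return (subject, object, color), or None if no verb is found."""
    words = [w for w in line.split() if w.lower() not in FILLER]
    # Start at index 1 so the subject always has at least one word.
    for i in range(1, len(words)):
        color = color_from_verb(words[i])
        if color:
            subject = "_".join(words[:i])
            obj = "_".join(words[i + 1:]) or "--NONE--"
            return subject, obj, color
    return None

print(parse_line("Anders is interested in Maria"))
print(parse_line("Bears love honey"))
```

Because the verb is found by scanning, this version handles subjects of any length, at the cost of treating the first matching word as the verb even when it is really part of a name.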
If you have any questions, I'd be happy to explain this further. Good luck.