Hey!

I want to colorize the words in a text according to their classification (category, declension, etc.). I have a fully working dictionary, but the problem is that there is a lot of ambiguity. foedere, for instance, can be a form of either the verb "fornicate" or the noun "treaty".

What are the general strategies for resolving these ambiguities or generating good guesses?

Thanks!

+2  A: 

The general strategy is to first run a part-of-speech tagger on the data to determine the word category (noun, verb, etc.). That, however, requires data (context statistics) and tools. This research paper may be a starting point.
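
To make the idea concrete, here is a minimal sketch of how context statistics can pick among the categories a dictionary allows. It is not a real tagger: it greedily chooses the candidate tag that most often followed the previous tag in a tiny hand-tagged sample. Everything in it is an assumption made up for illustration, including the names `LEXICON`, `TAGGED`, `train` and `tag`, the tag labels, and the toy Latin sentences. A real system would need a properly annotated corpus and probably a full sequence model.

```python
from collections import Counter

# Hypothetical toy lexicon: each word form maps to the categories
# the dictionary allows for it. "foedere" is the ambiguous one.
LEXICON = {
    "rex": {"NOUN"},
    "in": {"PREP"},
    "pacem": {"NOUN"},
    "firmat": {"VERB"},
    "foedere": {"NOUN", "VERB"},
}

# Hypothetical hand-tagged sentences used to collect context statistics.
TAGGED = [
    [("in", "PREP"), ("urbe", "NOUN"), ("manet", "VERB")],
    [("rex", "NOUN"), ("in", "PREP"), ("bello", "NOUN"), ("vincit", "VERB")],
    [("rex", "NOUN"), ("pacem", "NOUN"), ("firmat", "VERB")],
]


def train(tagged_sentences):
    """Count how often each tag follows the previous tag."""
    transitions = Counter()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for _, tag_label in sentence:
            transitions[(prev, tag_label)] += 1
            prev = tag_label
    return transitions


def tag(words, lexicon, transitions):
    """Greedily pick, for each word, the allowed tag that most often
    followed the previously chosen tag in the training sample."""
    prev = "<s>"
    result = []
    for word in words:
        candidates = lexicon.get(word.lower(), {"UNKNOWN"})
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=lambda t: transitions[(prev, t)])
        result.append((word, best))
        prev = best
    return result


if __name__ == "__main__":
    model = train(TAGGED)
    print(tag(["rex", "in", "foedere"], LEXICON, model))
    # prints [('rex', 'NOUN'), ('in', 'PREP'), ('foedere', 'NOUN')]
```

In this toy sample a noun follows a preposition twice and a verb never does, so "foedere" after "in" comes out as a noun. The greedy choice is only for brevity; the same counts could feed a proper sequence model instead.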

larsmans