I'm looking for a simple solution for keyword searches that match both singular and plural forms. I've heard about stemming, but I don't want to use all of its features, only the plural/singular transformation. The language is Dutch. I have already looked at http://www.snowball.tartarus.org. Does anyone know a simple solution for singular/plural-aware searches? Thanks in advance.

+2  A: 

Use a dictionary, a list of stopwords (words you don't want to singularize), plus the rules for the language. I can't help with the Dutch rules themselves, but I can show you how it would be done in Spanish, for instance:

  • Plurals end with s; if the word doesn't end with s, it's done
    • If it ends with s,
      • check whether it's a verb or conjugation ending with s; if it is one, it's done (such verbs could be added to the stopwords list)
      • if it's not a verb, remove the s
      • if the resulting word exists in the dictionary, it's done
      • if it doesn't, remove the previous letter as well and check the dictionary again
      • if it's still not there, it's an exception that you'll need to check manually and code into the exceptions (I can't think of any right now, but they always exist :)
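
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a finished implementation: the function name `singularize`, the tiny `DICTIONARY` and `STOPWORDS` sets, and the choice to return `None` for words that need manual review are all assumptions made for the example. A real system would load a full word list and a proper list of verb forms.

```python
# Toy data, assumed for illustration only; a real dictionary would be far larger.
DICTIONARY = {"casa", "flor", "gato", "luz"}
# Words ending in "s" that must not be singularized (e.g. verb forms).
STOPWORDS = {"eres", "vamos"}


def singularize(word, dictionary=DICTIONARY, stopwords=STOPWORDS):
    """Return the singular form of `word`, or None if it needs manual review."""
    w = word.lower()

    # Plurals end with "s"; anything else is treated as already singular.
    if not w.endswith("s"):
        return w

    # Verbs and other protected words are left untouched.
    if w in stopwords:
        return w

    # Remove the final "s" and look the result up in the dictionary.
    candidate = w[:-1]
    if candidate in dictionary:
        return candidate

    # Otherwise remove the previous letter too (covers "-es" plurals).
    candidate = w[:-2]
    if candidate in dictionary:
        return candidate

    # Still unknown: an exception to be handled by hand.
    return None


for example in ["casas", "flores", "eres", "gato", "luces"]:
    print(example, "->", singularize(example))
```

With this toy data, "luces" falls through to the manual-review case, because its singular "luz" cannot be reached by stripping letters alone. Words like that would go into a hand-maintained exceptions table.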

Of course this will not translate directly to Dutch.

In general, stemmers already exist and provide most of what you need. Why don't you want to use them?

Vinko Vrsalovic
A: 

Stemmers have caused a lot of user annoyance, so if I use one, all of its functionality except singular/plural handling would have to be disabled. So the requirement is to use only plural/singular transformations.

BoBaH32