I know DBSight supports synonyms and stop words for searching, but does it also handle the inflectional forms of a verb? For example, a search for 'swim' should find swim, swims, swimming, swam, and swum.

Link on DBSight Wiki : http://wiki.dbsight.com/index.php?title=User%5Fdictionary

+1  A: 

The behavior you are looking for can be implemented using lemmatization, which maps each inflected form of a word to its dictionary form. I am unaware of an existing Lucene analyzer that does this. Basis Tech's Lucene package does lemmatization, but it is not free, and I do not know whether it works with DBSight.

Yuval F
Thanks, Yuval, for pointing this out. After reading the lemmatization wiki page, it seems a stemmer would work for me too. There is actually a link to an article on the Lucene Snowball stemmer (http://e-mats.org/2009/05/modifying-a-lucene-snowball-stemmer/), but I'm not sure how that would work with DBSight.
Yasir Laghari
My bad :) I just found the answer to my question at http://www.dbsight.net/index.php?q=node/395. It seems DBSight comes with Snowball analyzers such as Snowball-English (Snowball-[Language]).
Yasir Laghari
Note that stemming != lemmatization. A stemmer may convert 'swimming' into 'swim' but not 'swam' into 'swim'.
Yuval F
Yep, you're right, they're not the same. As a first step, I plan to use stemming and see whether it takes care of most searches. If not, I'll definitely look into lemmatization solutions.
Yasir Laghari
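
To make the stemming vs. lemmatization distinction from the comments above concrete, here is a rough Python sketch. It is only a toy: `toy_stem`, `toy_lemmatize`, and the `IRREGULAR_LEMMAS` table are hypothetical names made up for illustration, and the rules are far simpler than the real Snowball algorithm or anything DBSight ships. The point is just that rule-based suffix stripping handles 'swims' and 'swimming', while irregular forms like 'swam' need some kind of dictionary lookup.

```python
# Toy illustration only: NOT the Snowball algorithm and NOT DBSight's analyzer.

# Hypothetical lookup table for irregular forms; a real lemmatizer
# would rely on a full morphological dictionary.
IRREGULAR_LEMMAS = {"swam": "swim", "swum": "swim"}


def toy_stem(word):
    """Strip a couple of common English suffixes using fixed rules."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]
        # Undo consonant doubling, e.g. "swimm" -> "swim".
        if len(w) >= 2 and w[-1] == w[-2] and w[-1] not in "aeiouls":
            w = w[:-1]
    elif w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]
    return w


def toy_lemmatize(word):
    """Look up irregular forms first, then fall back to the rule-based stemmer."""
    w = word.lower()
    return IRREGULAR_LEMMAS.get(w, toy_stem(w))


forms = ["swim", "swims", "swimming", "swam", "swum"]
stems = [toy_stem(f) for f in forms]
lemmas = [toy_lemmatize(f) for f in forms]

for form, stem, lemma in zip(forms, stems, lemmas):
    print(f"{form:10} stem={stem:6} lemma={lemma}")
```

With stemming alone, a search for 'swim' would match documents containing 'swims' or 'swimming' but would still miss 'swam' and 'swum'. Whether that gap matters depends on how often irregular verbs show up in your queries.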
A: 

Lucene comes with a stemmer called the "Lucene Snowball stemmer" (http://lucene.apache.org/java/2%5F4%5F0/api/contrib-snowball/index.html). It turns out that DBSight exposes it as analyzers named SnowBall - [Language], e.g. SnowBall - English, SnowBall - French, etc.

Yasir Laghari