I'm using the Apache Solr search engine to index my website's database.

I'm using Django with Haystack (http://haystacksearch.org/).

Say I have a document that contains the word "Chicken".

When I search for "chicken", Solr finds this document.

But when I search for "chick", it doesn't find anything.

Is there a way to fix this?

+1  A: 

If you want to find all words that start with chick, search for chick*.
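
As a rough sketch of what that looks like from Python, the snippet below builds a Solr select URL for a prefix query. The host, port, path and the field name `text` are assumptions for illustration (they are not from the question), so adjust them to your own setup. It lowercases the prefix because wildcard terms are often not passed through the query analyzer, which matters if your index is lowercased.

```python
from urllib.parse import urlencode

# Assumed values for illustration only: point this at your own Solr instance.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_prefix_query_url(prefix, field="text", base_url=SOLR_SELECT_URL):
    """Return a Solr select URL matching terms in `field` that start with `prefix`."""
    # Lowercase manually: wildcard terms may bypass the query-time analyzer.
    params = {"q": f"{field}:{prefix.lower()}*", "wt": "json"}
    return base_url + "?" + urlencode(params)


print(build_prefix_query_url("Chick"))
```

Note that a trailing wildcard only covers the "starts with" case; it does not help with matches in the middle or at the end of a word.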

Chase Seibert
+5  A: 

Note: The following solution is Solr 1.4 (and above) specific!

For more flexibility, I would recommend indexing your data with the NGramTokenizerFactory, which lets you match substrings anywhere in a term (the equivalent of leading and trailing wildcards). If you only need to match substrings at the beginning or end of the string, consider the EdgeNGramTokenizerFactory instead.
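
To make the difference concrete, here is a simplified Python sketch of the two ideas. It is not Solr's implementation: the function names are made up for illustration, it handles a single word only, and it ignores the order in which the real tokenizers emit their grams. The sizes 3 and 15 are just example values.

```python
def ngrams(text, min_size=3, max_size=15):
    """All substrings of `text` whose length is between min_size and max_size."""
    text = text.lower()
    grams = []
    for size in range(min_size, min(max_size, len(text)) + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def edge_ngrams(text, min_size=3, max_size=15):
    """Only the substrings anchored at the start of `text` (its prefixes)."""
    text = text.lower()
    return [text[:size] for size in range(min_size, min(max_size, len(text)) + 1)]


all_grams = ngrams("Chicken")
prefixes = edge_ngrams("Chicken")

# "chick" is a prefix, so both approaches index it.
print("chick" in all_grams, "chick" in prefixes)
# "icken" is not a prefix, so only the full n-gram approach indexes it.
print("icken" in all_grams, "icken" in prefixes)
```

The trade-off is index size: full n-grams store many more tokens per word than edge n-grams do.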

Here's a drop-in replacement for the text field type that would accommodate your need:

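<fieldType name="text" class="solr.TextField">
    <!-- Index time: break each value into lowercase n-grams of 3 to 15 characters -->
    <analyzer type="index">
        <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <!-- Query time: split on whitespace only, so each search term is matched against the indexed n-grams -->
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>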
Brian Mansell
Solr 1.5: is this a development version (not released yet)?
Pydev UA
Is there a solution like this for 1.4?
Pydev UA
Good catch: I corrected the answer to reflect 1.4
Brian Mansell
That's really useful. Personally, I think I'd create a new fieldtype like the one above but call it ntext or something, just so you don't mess with the original text fieldtype.
CraftyFella
The NGramTokenizerFactory works great for me (even on version 1.3).
Pydev UA
A: 

A different approach, if you are having trouble with only a small set of words, would be to use the solr.SynonymFilterFactory:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

You just have to maintain a simple text file that lists synonyms, with the equivalent terms on each line separated by commas:

chick, peep, chicken
dawg, hound, dog
moggie, puss, kitten, cat

Plurals should take care of themselves with other filters.
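
To illustrate the idea only (this is not how Solr implements the filter), the Python sketch below reads synonym lines in the comma-separated form and expands a search term into every term on its line. The function names are hypothetical, and the sample data is the list from this answer.

```python
SYNONYMS_TXT = """\
chick, peep, chicken
dawg, hound, dog
moggie, puss, kitten, cat
"""


def load_synonym_groups(text):
    """Map each term to the full set of terms that share its line."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terms = {t.strip().lower() for t in line.split(",") if t.strip()}
        for term in terms:
            groups[term] = terms
    return groups


def expand(term, groups):
    """Return the term plus its synonyms, or just the term if it has none."""
    term = term.lower()
    return sorted(groups.get(term, {term}))


groups = load_synonym_groups(SYNONYMS_TXT)
print(expand("chick", groups))
```

With this kind of expansion, a search for "chick" is treated as a search for "chick", "peep" or "chicken", which is why it only suits a small, hand-maintained set of words.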

JP