ansaurus

Question

Answer 1

A:

If you just want exact matches, use the KeywordTokenizerFactory instead of the StandardTokenizerFactory at query time.
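A minimal sketch of what that query-side change might look like in schema.xml. The field type name `text_exact` is hypothetical, and the index analyzer shown here is only an assumption, since the original schema is not included in the question. Note that KeywordTokenizerFactory emits the whole input as one token, so a query such as "foo bar" can only match if the index also holds "foo bar" as a single token.

```xml
<!-- Hypothetical field type; the name and the index-side analyzer are assumptions. -->
<fieldType name="text_exact" class="solr.TextField">
  <analyzer type="index">
    <!-- Assumed here for illustration: the whole field value is kept as one token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- The whole query string is kept as one token as well. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```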

Raoul Duke 2010-08-23 08:14:35

Thank you for the quick answer. However, when using KeywordTokenizerFactory I don't get any results at all with queries like "foo bar". I tried adding <filter class="solr.StandardFilterFactory"/> to the query analyzer, but nothing changed. I'm running out of ideas.

Daniel 2010-08-23 12:24:34

Answer 2

A:

For both analyzers, the first line should be the tokenizer. The tokenizer is used to split the text into smaller units (words, most of the time). For your need, the WhitespaceTokenizerFactory is probably the right choice.

If you want an absolutely exact match, you do not need any filter after the tokenizer. But if you do not want searches to be case sensitive, you need to add a LowerCaseFilterFactory.

Notice that you have two analyzers: one of type 'index' and the other of type 'query'. As the names imply, the first one is used when indexing content, while the other is used when you run queries. A rule that is almost always good is to use the same set of tokenizers and filters for both analyzers.
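Put together, the advice above could look roughly like the following field type. This is a sketch, not the asker's actual schema: the name `text_ws_lower` is hypothetical, and the LowerCaseFilterFactory lines can be dropped if matching should stay case sensitive.

```xml
<!-- Hypothetical field type name; both analyzers use the same chain. -->
<fieldType name="text_ws_lower" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- The tokenizer comes first: split on whitespace only. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Optional: makes matching case insensitive. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Content indexed before such a change was analyzed with the old chain, so it has to be reindexed before queries against it behave as expected.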

Pascal Dimassimo 2010-08-23 12:55:00

thank you, that helped a lot!

Daniel 2010-08-23 14:22:16

Answer 3

A:

Hi, I guess you don't get any results because the tokenizing is done differently on the data that is already indexed. As Pascal said, WhitespaceTokenizerFactory is the right choice in your case. Use it at both index and query time, and check the results after indexing some data again, not against the previously indexed data.

I suggest using the analysis page to see the results without actually indexing. It's quite useful. Make your changes in the schema, reload the core, go to the analysis page, and look at the verbose output to get the step-by-step analysis.

kaka 2010-08-26 02:28:48

Solr query/field analyzer
