ansaurus

Question

Answer 1

+1 A:

You are probably using a WordDelimiterFilterFactory with splitOnNumerics activated. Check the analyzers of the field you are storing this data into.

Pascal Dimassimo 2010-10-04 13:18:42

I do indeed have the WordDelimiterFilter defined here: <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100" > <analyzer> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/> ... but a) splitOnNumerics is not activated, and b) it is only defined on the fieldType "textTight", which AFAIK I am not using.

Carsten Gehling 2010-10-05 13:29:27

Oh bloody comment format... :-) Here's a Pastie with my entire schema.xml: http://pastie.org/1200681

Carsten Gehling 2010-10-05 13:32:17

The 'text' fieldType uses a LetterTokenizerFactory. According to the documentation, that tokenizer discards all non-letter characters, so digits never make it into the index. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LetterTokenizerFactory

Pascal Dimassimo 2010-10-05 14:29:16
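
For illustration, an analyzer of the kind described in the comment above could look roughly like the sketch below. The actual schema is only available through the Pastie link, so the field type shown here is an assumption, and the sample input in the comment is hypothetical.

```xml
<!-- Hypothetical sketch, not copied from the poster's schema.xml -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- LetterTokenizerFactory emits only runs of letters, so a
         hypothetical input such as "model 42" is indexed as the
         single token "model"; the digits are dropped. -->
    <tokenizer class="solr.LetterTokenizerFactory"/>
  </analyzer>
</fieldType>
```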

Oh... Well, I still have a lot to learn about Solr. :-) Which tokenizer would you recommend for a "simple" text field?

Carsten Gehling 2010-10-05 19:29:50

StandardTokenizerFactory and WhitespaceTokenizerFactory are usually good choices. Then add filters as needed; it all depends on your requirements. Check the schema.xml provided in the example folder of the latest Solr package, and also the example here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Specifying_an_Analyzer_in_the_schema

Pascal Dimassimo 2010-10-05 19:55:28

Thanks a bunch. I've changed to the StandardTokenizerFactory (and its "sibling" StandardFilter) and I am now reindexing my data. I look forward to seeing the result.

Carsten Gehling 2010-10-05 20:03:50
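
A minimal sketch of the change described above, assuming the field type is named "text" as in the earlier comment. The LowerCaseFilterFactory line is an illustrative addition that the thread does not mention, and StandardFilterFactory reflects the Solr releases of that time; later releases may no longer ship it.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- StandardTokenizerFactory keeps numeric tokens, unlike LetterTokenizerFactory -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <!-- Assumption: lowercasing for case-insensitive matching -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because tokenization is applied at index time, documents indexed under the old analyzer need to be reindexed before the change shows up in query results.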

Just wanted to let you know that I reindexed my data overnight with the StandardTokenizer, and now everything works as expected. Thank you very much.

Carsten Gehling 2010-10-06 05:17:53

Using integers in text-query