Hi!
I am indexing some files written in Spanish in Solr, and sometimes characters like ¿D é .... appear.
I wonder if there is a TokenFilter that avoids these characters when the text contains accented letters (á, é, í, ó...)
or the letter ñ.
Thanks
I added this to my schema.xml:
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
which should be the solution, but the characters are still there. Any other ideas? Thanks
I added it where all the other filters are:
<fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    ....
    <!-- Filter to remove accents and ñ -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    ....
  </analyzer>
</fieldType>
Of course, I rebuilt my index after that.
(I am adding this as an answer because it wasn't clear enough in the comment.)
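
For reference, here is a minimal sketch of how the same analyzer is conventionally laid out, with the char filter listed first, since char filters run on the raw text before the tokenizer sees it. It only rearranges the elements already shown above; the elided filters from the real config are left out, and this is an assumption about layout, not a confirmed fix for the stray characters.

```xml
<fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters operate on the raw character stream, before tokenization -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Token filters run after the tokenizer (other filters from the real config omitted here) -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
  </analyzer>
</fieldType>
```

Note that the analysis chain only changes the terms written to the index. The stored field value returned in search results keeps its original characters, so the mapping can be working for matching even though the displayed text still looks the same.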