I just took over our Solr/Lucene code from a former colleague, and there is a weird bug.

If the index is not optimized after a data import, that is, if it contains multiple segment files, the search results are wrong. We are using a customized Solr SearchComponent. As far as I know, optimizing a Lucene index should not affect search results. I suspect this is related to multithreading, an unclosed searcher/reader, or something similar.

Can anybody help? Thank you.

A: 

This is still a guess. I found a custom Lucene filter that is used by the custom search component. Inside that filter, SolrIndexSearcher.search is called against the filter queries. Chances are high that this is the cause.

It could be a hint for people who are familiar with Lucene.

KailZhang
Found an interesting post that mentions filters, segments, etc.: http://www.gossamer-threads.com/lists/lucene/java-user/97270
KailZhang
I think I'm very close to the truth. In the search component's process method, Lucene's IndexSearcher.search is called rather than Solr's search. If I replace that call with Solr's, the results become correct. So my task now is to rewrite the code to use Solr's search (the old code uses the docs returned from Lucene's search).
KailZhang
Well, I came back to share what I found. Now I'm quite sure of what happened. SolrIndexSearcher.getDocSet is called in our filter, but getDocSet scans the whole index, meaning all the segments, while the filter is invoked once per segment. So with 8 segments there will be 8*8 segment scans. Also remember that inside a filter, doc ids are relative to the current segment, whereas the doc ids returned by getDocSet refer to the whole index. So beware of this when you write a custom Lucene filter.
KailZhang
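
To make the doc id pitfall above concrete, here is a minimal plain-Java sketch. It does not use the Lucene or Solr APIs: the Segment class, its docBase and maxDoc fields, and both helper methods are hypothetical stand-ins that only model the idea that each segment numbers its documents from 0, while a whole-index lookup returns ids counted across all segments. Assuming that model, a per-segment filter has to drop ids outside its own segment and subtract the segment's starting offset.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentDocIds {

    /** Hypothetical stand-in for one index segment (not a Lucene class). */
    public static final class Segment {
        public final int docBase; // index-wide id of this segment's first document
        public final int maxDoc;  // number of documents in this segment

        public Segment(int docBase, int maxDoc) {
            this.docBase = docBase;
            this.maxDoc = maxDoc;
        }
    }

    /** Buggy variant: uses index-wide ids as if they were segment-relative. */
    public static List<Integer> naiveLocalIds(Segment seg, int[] globalMatches) {
        List<Integer> out = new ArrayList<>();
        for (int id : globalMatches) {
            if (id < seg.maxDoc) {
                out.add(id); // wrong document whenever seg.docBase != 0
            }
        }
        return out;
    }

    /** Corrected variant: keep only this segment's ids and rebase them. */
    public static List<Integer> rebasedLocalIds(Segment seg, int[] globalMatches) {
        List<Integer> out = new ArrayList<>();
        for (int id : globalMatches) {
            int local = id - seg.docBase;
            if (local >= 0 && local < seg.maxDoc) {
                out.add(local);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] globalMatches = {1, 4, 6};      // ids from a whole-index lookup
        Segment first = new Segment(0, 3);    // index-wide ids 0..2
        Segment second = new Segment(3, 5);   // index-wide ids 3..7

        System.out.println(naiveLocalIds(first, globalMatches));    // [1]
        System.out.println(rebasedLocalIds(first, globalMatches));  // [1]
        System.out.println(naiveLocalIds(second, globalMatches));   // [1, 4]
        System.out.println(rebasedLocalIds(second, globalMatches)); // [1, 3]
    }
}
```

Note that both variants agree on the first segment, whose offset is 0. In this model that is why an optimized, single-segment index hides the bug, and why it only shows up once there are multiple segments.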