I'm in the process of updating a tool that uses a Lucene index. As part of this update we are moving from Lucene 2.0.0 to 3.0.2. For the most part this has been entirely straightforward. However, in one instance I can't seem to find a straightforward conversion.

Basically I have a simple query and I need to iterate over all hits. In Lucene 2 this was simple, e.g.:

Hits hits = indexSearcher.search(query);
for(int i=0 ; i<hits.length() ; i++){
  // Process hit
}

In Lucene 3 the API for IndexSearcher has changed significantly, and although I can bash together something that works, it only works by getting the top X documents and making sure that X is sufficiently large.

While the number of hits (in my case) is typically between zero and ten, there are anomalous situations where it could be much higher. Having a fixed limit therefore feels wrong. Furthermore, setting the limit really high causes an OOME, which suggests that space for all X possible hits is allocated up front. As this operation is carried out a lot, something reasonably efficient is desired.

Edit:

Currently I've got the following to work:

TopDocs hits = indexSearcher.search(query, MAX_HITS);
// Only hits.scoreDocs.length results (at most MAX_HITS) are actually returned
for (int i = 0; i < hits.scoreDocs.length; i++) {
   // Process hit, e.g. via hits.scoreDocs[i].doc
}

This works fine except that

a) what if there are more hits than MAX_HITS?

and

b) if MAX_HITS is large then I'm wasting memory as room for each hit is allocated before the search is performed.

As most of the time there will only be a few hits, I don't mind doing follow-up searches to get the subsequent hits, but I can't seem to find a way to do that.
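
To be concrete, the kind of follow-up search I have in mind would look roughly like the sketch below (firstPass is just a name I'm using here; query, indexSearcher and MAX_HITS are the same as above). I simply haven't been able to confirm that re-running the search sized by totalHits is the intended pattern:

TopDocs firstPass = indexSearcher.search(query, MAX_HITS);
TopDocs hits = firstPass;
if (firstPass.totalHits > firstPass.scoreDocs.length) {
    // More matches exist than were returned, so repeat the search sized to the full count
    hits = indexSearcher.search(query, firstPass.totalHits);
}
for (int i = 0; i < hits.scoreDocs.length; i++) {
    // Process hit
}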

A: 

Why don't you use Searcher.search(Query query, int n)? You can specify the number of results you want back, and you can use the TopDocs object that is returned to iterate through the results.

Using Hits to process long result sets was a bad idea, because behind the scenes the Hits object would run additional searches to fill in results it didn't already have.

TopDocs only contains ids and scores, so you shouldn't have a memory problem even for large n.
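
As a rough sketch (n here is whatever limit you choose; query and indexSearcher are the ones from your snippets):

TopDocs topDocs = indexSearcher.search(query, n);
for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
    Document doc = indexSearcher.doc(scoreDoc.doc);  // fetch the stored fields for this hit if you need them
    // Process hit
}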

bajafresh4life
That is basically what I'm currently doing. But what if I need result number n+1?
Kris
Just ask for N + M where M is some kind of constant value. I think you're worrying too much about memory here; TopDocs only contains scores and ids, which is almost no memory at all, even for large N. If you don't believe me, run a profiler to find out.
bajafresh4life
A: 

How about using numDocs() from the IndexReader as the maximum number of results?

Do watch out for the edge case of zero documents in the index though...
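
Roughly like this, as a sketch (maxResults is just a local name; query and indexSearcher are the ones from your snippets):

int maxResults = Math.max(1, indexSearcher.getIndexReader().numDocs());  // numDocs() is 0 for an empty index
TopDocs hits = indexSearcher.search(query, maxResults);
for (int i = 0; i < hits.scoreDocs.length; i++) {
    // Process hit
}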

Hope this helps,

Moleski
A: 

IndexSearcher has a docFreq(Term) method. Invoking it does not seem to carry a performance penalty, and its result is a suitable value for the number of documents to request.

E.g.

int freq = indexSearcher.docFreq(new Term(FIELD, value));
TopDocs hits = indexSearcher.search(query, freq);
for (int i = 0; i < hits.scoreDocs.length; i++) {
   // Process hit
}

This works because my query is essentially a TermQuery. If it were a more complex query, this wouldn't be suitable.

Kris