I agree with this answer and like it, but I really would rather have this solved before going live. So I am starting a bounty in hopes my ass isn't bitten later ;).


With Lucene.NET 2.9.x (any version using .NET), how might I search and limit/page the results, similar to the LIMIT keyword in SQLite and MySQL? I'd like to find the top 20 docs that contain the word 'Apple', and have a link to page 20 that returns 20 results, skipping the first 400 docs with a higher score. Should I implement it this way? (Credit goes to Pascal Dimassimo's answer below.)


1k Question

Hey guys, I currently have 999 questions, so this will be my 1000th question! I just wanted to say thank you to all of you who answered my questions, left me comments and overall helped me learn programming and technologies years sooner than it would have taken me alone.

I also want to mention Edward Tanguay, who led in most questions asked for a long time and, more importantly, asks great questions with many upvotes. I strive to get my quality as high as his. I also want to mention these guys who are asking many questions as well: ooo, metal-gear-solid, Masi, Blankman.

+2  A: 

The search method of the Searcher class has a parameter to limit the number of results returned for a query.

Query query = parser.parse("Apple");
TopDocs topDocs = searcher.search(query, 20);

But Lucene does not support pagination. You will have to rerun your query with a larger limit and keep only the results that fall in the range you need.
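
To make the "redo and keep the range" idea concrete, here is a minimal sketch of the offset arithmetic only. It assumes you have already copied the doc IDs out of topDocs.scoreDocs, best hit first, after calling searcher.search(query, hitsNeeded(page, pageSize)). PageSlice, hitsNeeded and slice are hypothetical names made up for this example, not Lucene API, and pages are counted from zero so that page 20 skips the first 400 hits as in the question.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: picks one page out of a ranked hit list.
final class PageSlice {
    /** Number of top hits to request so that the given zero-based page is covered. */
    static int hitsNeeded(int page, int pageSize) {
        return (page + 1) * pageSize;
    }

    /** Returns the doc IDs for a zero-based page; rankedDocIds must be sorted best-first. */
    static int[] slice(int[] rankedDocIds, int page, int pageSize) {
        int start = Math.min(page * pageSize, rankedDocIds.length);
        int end = Math.min(start + pageSize, rankedDocIds.length);
        return Arrays.copyOfRange(rankedDocIds, start, end);
    }

    public static void main(String[] args) {
        // Stand-in for the doc IDs read from topDocs.scoreDocs (assumed input).
        int[] ranked = new int[hitsNeeded(20, 20)];
        for (int i = 0; i < ranked.length; i++) {
            ranked[i] = 1000 + i;
        }
        int[] page = slice(ranked, 20, 20);
        System.out.println(page.length + " hits, first doc ID " + page[0]);
    }
}
```

A page past the end of the hit list simply comes back empty, so the caller does not need a separate bounds check.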

See this question.

Pascal Dimassimo
I am now worried about what may happen if my app grows to handle hundreds of thousands of docs. I'll drop the 2.9 requirement and hope there is an answer. I'll accept in a few days if there is no other solution. I'm disappointed Lucene doesn't support this.
acidzombie24
Lucene is usually really fast when redoing the same search. You should test first to see how it goes for your app.
Pascal Dimassimo
I agree with Pascal, paging is almost always the wrong way to go with Lucene. The time complexity of finding the top n documents grows with log(n) (see http://philosophyforprogrammers.blogspot.com/2010/09/lucene-performance.html), so increasing from 20 to 40 is really a very trivial change.
Xodarap
Interesting article, Xodarap. It looks like it is fast. OK, I'll be doing this when the bounty is over.
acidzombie24
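
The log(n) growth Xodarap mentions is what you would expect from a bounded heap: each of the N matching docs costs at most O(log n) to consider, where n is the number of top hits kept. Below is a sketch of that general technique in plain Java. It is an illustration only, not Lucene's own collector code, and TopN is a made-up name.

```java
import java.util.PriorityQueue;

// Illustration only: a bounded min-heap that keeps the best n of N scores.
final class TopN {
    /** Returns the n highest scores in descending order. */
    static float[] topN(float[] scores, int n) {
        if (n <= 0) {
            return new float[0];
        }
        // Min-heap: the weakest score we are still keeping sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>(n);
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);          // O(log n) insert
            } else if (s > heap.peek()) {
                heap.poll();          // evict the weakest kept score
                heap.add(s);
            }
        }
        // Drain smallest-first, filling the output from back to front.
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }
}
```

Going from n = 20 to n = 40 only changes the heap size, so the extra work per matching doc is small compared with finding the matches in the first place.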
+1  A: 

Searcher.search, which you use in the second line, also has this signature:

Search(Query query, Filter filter, HitCollector results)

Use a HitCollector to flush temporary results into fast temporary storage. For example, if the user asks for the first 20 results, return them right away and start caching all the others in a background thread. You really only need to store each document's ID, so for 1 million results roughly 4 MB is expected (4 bytes per int ID).

Once the results are in that storage, paging is simple to support.
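
A rough sketch of what such storage could look like, assuming the doc IDs are added in their final rank order. DocIdCache is a hypothetical class invented for this example, not part of Lucene. One caveat: as far as I know, a collector is handed hits in index order rather than by score, so a real version would probably need to keep the scores too and sort before paging.

```java
import java.util.Arrays;

// Hypothetical cache, not a Lucene class: stores ranked doc IDs as plain ints.
final class DocIdCache {
    private int[] ids = new int[1024];
    private int size = 0;

    /** Appends a doc ID; callers are assumed to add IDs in final rank order. */
    void add(int docId) {
        if (size == ids.length) {
            ids = Arrays.copyOf(ids, size * 2); // grow by doubling
        }
        ids[size++] = docId;
    }

    int size() {
        return size;
    }

    /** Bytes used by the stored IDs themselves (4 per ID), ignoring array slack. */
    long payloadBytes() {
        return 4L * size;
    }

    /** Returns the doc IDs for a zero-based page, or an empty array past the end. */
    int[] page(int page, int pageSize) {
        int start = Math.min(page * pageSize, size);
        int end = Math.min(start + pageSize, size);
        return Arrays.copyOfRange(ids, start, end);
    }
}
```

With 1,000,000 IDs added, payloadBytes() reports 4,000,000 bytes, which is where the 4 MB estimate comes from.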

Dewfy