I am using Lucene to show search results in a web application. I am also using custom paging to display them. A search can return anywhere from 5,000 to 10,000 results or more. Can someone please tell me the best strategy for paging and caching the search results?

Thanks!

+10  A: 

I would recommend you don't cache the results, at least not at the application level. Running Lucene on a box with lots of memory that the operating system can use for its file cache will help though.

Just repeat the search with a different offset for each page. Caching introduces statefulness that, in the end, undermines performance. We have hundreds of concurrent users searching an index of over 40 million documents. Searches complete in much less than one second without using explicit caching.

Using the Hits object returned from search, you can access the documents for a page like this:

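Hits hits = searcher.search(query);
// page is zero-based: page 0 starts at the first hit
int offset = page * recordsPerPage;
// the last page may hold fewer than recordsPerPage hits
int count = Math.min(hits.length() - offset, recordsPerPage);
for (int i = 0; i < count; ++i) {
  Document doc = hits.doc(offset + i);
  ...
}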
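
If you want the offset arithmetic in one place, with a guard for page numbers past the end of the results, a minimal sketch could look like the following. PageWindow and its members are hypothetical names made up for this example, not part of Lucene; totalHits stands in for hits.length(), and page is assumed to be zero-based as in the snippet above.

```java
/** Hypothetical helper (not a Lucene class): works out which hits belong on a page. */
final class PageWindow {
    final int offset; // index of the first hit on the page
    final int count;  // number of hits on the page, 0 if the page is past the end

    private PageWindow(int offset, int count) {
        this.offset = offset;
        this.count = count;
    }

    /** page is zero-based; totalHits is the number of hits the search returned. */
    static PageWindow of(int page, int recordsPerPage, int totalHits) {
        if (page < 0 || recordsPerPage <= 0 || totalHits < 0) {
            throw new IllegalArgumentException("negative page or total, or non-positive page size");
        }
        // Multiply as long so a very large page number cannot overflow int.
        long start = (long) page * recordsPerPage;
        if (start >= totalHits) {
            return new PageWindow(totalHits, 0); // requested page is past the last hit
        }
        int offset = (int) start;
        int count = Math.min(totalHits - offset, recordsPerPage);
        return new PageWindow(offset, count);
    }

    /** Number of pages needed to show totalHits results (ceiling division). */
    static int pageCount(int totalHits, int recordsPerPage) {
        return (int) (((long) totalHits + recordsPerPage - 1) / recordsPerPage);
    }
}
```

With 25 hits and 10 records per page, PageWindow.of(2, 10, 25) gives offset 20 and count 5. A page past the end gives a count of 0, so a loop over hits.doc(offset + i) simply does nothing.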
erickson
Do you still have no performance issues?
acidzombie24
A: 

There are many ways to display pagination. My personal favorite is one that Adobe's search used to have, which looked similar to this:

/Previous/ | '1' 2 3 4 ... 8096 | Next
Previous | 1 ... 90 '91' 92 ... 8096 | Next
Previous | 1 ... 8093 8094 8095 '8096' | /Next/

Basically, it has 3 cases:

  • Somewhere at the beginning
  • Somewhere at the end
  • Elsewhere

So you'll basically always be displaying the links 'Previous', 'Next', first page, last page and three others. I suggest you generate each of those with its own function that returns the HTML. That will keep your code legible and make it easier to tweak or replace with something else. It also makes it easier to handle specifics such as applying a '.disabled' CSS class to Previous/Next when you're at the very beginning/end.

Possibly something like this [edit: sorry, SO keeps cutting off some code; you'll have to go into edit to view the whole code]:


function pagination() {
    return pagePrevious() . pageFirst() . page_2() . page_3() . page_4() . pageLast() . pageNext();
}

function pagePrevious() {
    return (currentPage!=1) ? '<a href="..." class="previous">...' : '<a href="..." class="previous disabled">...';
}

function pageFirst() {
    return (currentPage!=1 and currentPage 

Essentially:

  • You'll have a CSS class for the Previous/Next button and a class to grey them out if they're illegal
  • You'll have a CSS class for displaying the active page
  • You'll need some logic for Previous/Next to know when they're illegal, and also some logic for when to show the trailing '...' after the first page and the preceding '...' before the link for the last result page
  • For the rest, you'll just have to do an offset from the current page and show which of the page_2, page_3 or page_4 is active
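
To make the three cases concrete, here is a rough sketch in Java (to match the Lucene code elsewhere on this page) that produces the text form of the bar shown above. PaginationBar, pageTokens and render are hypothetical names, and the rule that the three middle links are clamped to stay between the first and last page is my reading of the examples, not something taken from Adobe's implementation. It emits plain text rather than HTML; the slashes mark a disabled Previous/Next, as in the examples.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds the text form of the pagination bar. */
final class PaginationBar {

    /** Pages are 1-based; lastPage is the total number of pages. */
    static List<String> pageTokens(int currentPage, int lastPage) {
        if (lastPage < 1 || currentPage < 1 || currentPage > lastPage) {
            throw new IllegalArgumentException("currentPage must be within 1..lastPage");
        }
        List<String> tokens = new ArrayList<>();
        if (lastPage <= 5) {
            // Few enough pages to list them all; no '...' needed.
            for (int p = 1; p <= lastPage; p++) {
                tokens.add(label(p, currentPage));
            }
            return tokens;
        }
        // First of the three middle links, clamped so the three links
        // always sit strictly between the first and the last page.
        int start = Math.max(2, Math.min(currentPage - 1, lastPage - 3));
        tokens.add(label(1, currentPage));
        if (start > 2) {
            tokens.add("..."); // pages are skipped after the first page
        }
        for (int p = start; p <= start + 2; p++) {
            tokens.add(label(p, currentPage));
        }
        if (start + 2 < lastPage - 1) {
            tokens.add("..."); // pages are skipped before the last page
        }
        tokens.add(label(lastPage, currentPage));
        return tokens;
    }

    /** The active page is quoted, mirroring the examples. */
    private static String label(int page, int currentPage) {
        return page == currentPage ? "'" + page + "'" : String.valueOf(page);
    }

    /** Full bar; slashes mark Previous/Next as disabled at either end. */
    static String render(int currentPage, int lastPage) {
        String previous = currentPage == 1 ? "/Previous/" : "Previous";
        String next = currentPage == lastPage ? "/Next/" : "Next";
        return previous + " | " + String.join(" ", pageTokens(currentPage, lastPage)) + " | " + next;
    }
}
```

Calling PaginationBar.render(91, 8096) should reproduce the middle example line. For real output you would swap the string labels for links and attach the active and disabled CSS classes in label and render.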
kRON
This doesn't answer the question for Lucene; instead, it is a generic solution for pagination on the web.
Khash
A: 

Thanks, folks! I sincerely appreciate your input.