views:

2071

answers:

2

Hi,

I'm working on a .NET application that uses ASP.NET 3.5 and Lucene.Net. I am showing the search results returned by Lucene.Net in an ASP.NET DataGrid. I need to implement paging (10 records per page) for this .aspx page.

How do I get this done using Lucene.Net?

Thanks in advance...!

+13  A: 

Here is a way to build a simple list matching a specific page with Lucene.Net. This is not ASP.Net specific.

int first = 0, last = 9; // TODO: Set first and last to correct values according to page number and size
Searcher searcher = new IndexSearcher(YourIndexFolder);
Query query = BuildQuery(); // TODO: Implement BuildQuery
Hits hits = searcher.Search(query);
List<Document> results = new List<Document>();
for (int i = first; i <= last && i < hits.Length(); i++)
    results.Add(hits.Doc(i));

// results now contains a page of documents matching the query

Basically the Hits collection is very lightweight. The cost of getting this list is minimal. You just instantiate the needed Documents by calling hits.Doc(i) to build your page.
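
For the TODO on first and last, here is a minimal sketch of the arithmetic, assuming a 1-based page number and a fixed page size. The names PagingSketch, GetPageBounds and GetPageCount are hypothetical helpers made up for illustration; they are not part of Lucene.Net.

```csharp
using System;

static class PagingSketch
{
    // Computes the zero-based, inclusive index range for a 1-based page number.
    // Hypothetical helper, not a Lucene.Net API.
    public static void GetPageBounds(int pageNumber, int pageSize, out int first, out int last)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        first = (pageNumber - 1) * pageSize;
        last = first + pageSize - 1;
    }

    // Number of pages needed to show totalHits results (0 when there are no hits).
    // Useful for rendering a pager; pass hits.Length() as totalHits.
    public static int GetPageCount(int totalHits, int pageSize)
    {
        if (totalHits < 0) throw new ArgumentOutOfRangeException("totalHits");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        // Integer ceiling division.
        return (totalHits + pageSize - 1) / pageSize;
    }
}
```

With pageNumber = 1 and pageSize = 10 this gives first = 0 and last = 9, the values hard-coded in the snippet above. The i < hits.Length() check in the loop already takes care of a short final page, so last does not need to be clamped.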

David Thibault
I would suggest using something like Memcache or another in-memory store keyed on the search term itself. That way you don't need to requery, but investigate whether this actually improves performance.
bleevo
I feel like folks are missing the point here. I think the point is how to turn the Lucene results into a format that works well with the ASP.NET datagrid. The ASP.NET datagrid is designed to work well with ADO.NET DataSets (although there are other ways to use it). My answer shows a way of converting from Lucene objects to ADO.NET objects. If you ignore the specifics of the DataGrid, I think you aren't answering the question.
Corey Trager
The datagrid works perfectly well with a List<T>. Just create the list from the info contained in the Document objects and DataBind it to the grid.
David Thibault
Ok - I retract my comment.
Corey Trager
A: 

What I do is iterate through the hits and insert them into a temporary table in the db. Then I can run a regular SQL query - joining that temp table with other tables too - and give the grid the DataSet/DataView that it wants.

Note that I do the inserts and the query in ONE TRIP to the db, because I'm using just one SQL batch.

void Page_Load(Object sender, EventArgs e)
{

 dbutil = new DbUtil();
 security = new Security();
 security.check_security(dbutil, HttpContext.Current, Security.ANY_USER_OK);

 Lucene.Net.Search.Query query = null;

 try
 {
  if (string.IsNullOrEmpty(Request["query"]))
  {
   throw new Exception("You forgot to enter something to search for...");
  }

  query = MyLucene.parser.Parse(Request["query"]);

 }
 catch (Exception e3)
 {
  display_exception(e3);
 }


 Lucene.Net.Highlight.QueryScorer scorer = new Lucene.Net.Highlight.QueryScorer(query);
 Lucene.Net.Highlight.Highlighter highlighter = new Lucene.Net.Highlight.Highlighter(MyLucene.formatter, scorer);
 highlighter.SetTextFragmenter(MyLucene.fragmenter); // new Lucene.Net.Highlight.SimpleFragmenter(400));

 StringBuilder sb = new StringBuilder();
 string guid = Guid.NewGuid().ToString().Replace("-", "");
 Dictionary<string, int> dict_already_seen_ids = new Dictionary<string, int>();

 sb.Append(@"
create table #$GUID
(
temp_bg_id int,
temp_bp_id int,
temp_score float,
temp_text nvarchar(3000)
)
 ");

 lock (MyLucene.my_lock)
 {

  Lucene.Net.Search.Hits hits = null;
  try
  {
   hits = MyLucene.search(query);
  }
  catch (Exception e2)
  {
   display_exception(e2);
  }

  // insert the search results into a temp table which we will join with what's in the database
  for (int i = 0; i < hits.Length(); i++)
  {
   if (dict_already_seen_ids.Count < 100)
   {
    Lucene.Net.Documents.Document doc = hits.Doc(i);
    string bg_id = doc.Get("bg_id");
    if (!dict_already_seen_ids.ContainsKey(bg_id))
    {
     dict_already_seen_ids[bg_id] = 1;
     sb.Append("insert into #");
     sb.Append(guid);
     sb.Append(" values(");
     sb.Append(bg_id);
     sb.Append(",");
     sb.Append(doc.Get("bp_id"));
     sb.Append(",");
     //sb.Append(Convert.ToString((hits.Score(i))));
     sb.Append(Convert.ToString((hits.Score(i))).Replace(",", "."));  // Somebody said this fixes a bug. Localization issue?
     sb.Append(",N'");

     string raw_text = Server.HtmlEncode(doc.Get("raw_text"));
     Lucene.Net.Analysis.TokenStream stream = MyLucene.anal.TokenStream("", new System.IO.StringReader(raw_text));
     string highlighted_text = highlighter.GetBestFragments(stream, raw_text, 1, "...").Replace("'", "''");
     if (highlighted_text == "") // sometimes the highlighter fails to emit text...
     {
      highlighted_text = raw_text.Replace("'","''");
     }
     if (highlighted_text.Length > 3000)
     {
      highlighted_text = highlighted_text.Substring(0,3000);
     }
     sb.Append(highlighted_text);
     sb.Append("'");
     sb.Append(")\n");
    }
   }
   else
   {
    break;
   }
  }
  //searcher.Close();
 }
Corey Trager
I would discourage this practice. Way too many moving parts. Also, hitting the DB for an index search kind of defeats the purpose.
David Thibault
@David - The details: My db contains data (bugs, tickets, issues) with columns that change frequently. Either I update BOTH the db AND the Lucene index whenever these columns change, OR I update ONLY the db. I chose to update only the db. Another detail: the part of my app that displays the results existed already and expected a .NET DataSet/DataView object as input. So I wanted to translate the results into a DataSet/DataView anyway. The OP wants to display results in a datagrid, which also wants a DataSet/DataView as input. Don't just negate. Suggest a better alternative.
Corey Trager
The solution I presented answers the question of how to do paging with lucene. Binding the results to a grid is then easy (see my reply to your other comment).
David Thibault
ok. Clear. Thanks.
Corey Trager