I have a program written in ASP.NET with Lucene.Net. First I create an index from 28,000 documents. Then I execute a search, but sometimes an error is thrown (I think it happens when there are many results).

The important part of code:

Dim hits As Hits = searcher.Search(query)
Dim results As Integer = hits.Length()  ' results (number of hits)

'#####################
'####### RESULTS #####
'#####################

trefferanzahl = results

If (results > 0) Then  
    Dim i As Integer
    Dim h As Integer = results - 1
    ReDim array_results(h, 6) ' array for storing the "fields"
    Dim cellX As New TableCell()

    For i = 0 To results - 1 Step 1 

        Dim tmpdoc As Document = hits.Doc(i)   ' HERE THE ERROR!
        Dim score As Double = hits.Score(i)     

        MsgBox("2. Docname: " & hits.Doc(i).Get("title"))


        array_results(i, 0) = tmpdoc.Get("title")
        array_results(i, 0) += tmpdoc.Get("doc_typ") 
        array_results(i, 1) = tmpdoc.Get("pfad")
        array_results(i, 2) = tmpdoc.Get("date_of_create")
        array_results(i, 3) = tmpdoc.Get("last_change")
        array_results(i, 4) = tmpdoc.Get("id")
        array_results(i, 5) = tmpdoc.Get("doc_typ")
        array_results(i, 6) = CStr(score)
    Next

    ' Load this data only once.

    ItemsGrid.DataSource = CreateDataSource()   
    ItemsGrid.DataBind()
Else
    bool_Suchergebnis = False
End If

searcher.Close()

Thanks in advance

+2  A: 

A good principle when searching across very large collections is to limit the results you process as early as possible. I will assume that you are implementing paging in your grid, and let's assume that PageSize is 20.

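What you need to do is make sure that you have access to the PageSize and the current PageNo within this method. Then use LINQ across the result set to Skip(PageNo * PageSize) and then Take(PageSize). The order matters: skipping first and taking second returns the current page, whereas taking first would leave nothing to skip into. If the Hits object cannot be queried with LINQ directly, the same effect comes from looping only over the indexes of the current page. Either way, you will only have to process 20 records.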
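As a rough illustration of the paging arithmetic, here is a minimal sketch. The module name `PagingSketch` and the helper `PageIndexes` are hypothetical, PageNo is assumed to be zero-based, and the total is assumed to come from `hits.Length()` as in the question. It only works out which hit indexes belong to a page and does not touch Lucene.Net itself.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module PagingSketch
    ' Hypothetical helper: returns the hit indexes that belong to one page.
    ' pageNo is assumed to be zero-based; totalHits would come from hits.Length().
    Function PageIndexes(totalHits As Integer, pageNo As Integer, pageSize As Integer) As List(Of Integer)
        ' Skip the earlier pages first, then take one page worth of indexes.
        Return Enumerable.Range(0, totalHits).
            Skip(pageNo * pageSize).
            Take(pageSize).
            ToList()
    End Function

    Sub Main()
        ' Page 2 (zero-based) of 28000 hits at 20 per page.
        Dim indexes As List(Of Integer) = PageIndexes(28000, 2, 20)
        Console.WriteLine("{0} to {1}", indexes.First(), indexes.Last()) ' prints "40 to 59"
    End Sub
End Module
```

In the loop from the question, only these indexes would then be passed to hits.Doc(i) and hits.Score(i), so the array needs just PageSize rows instead of one row per hit.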

Then you have two options. If you are binding directly to the array, you might be able to get away with empty items, but I am not sure, so you might have to place dummy items into the data source array in all positions that won't be displayed. Not ideal, but certainly quicker than processing thousands of hits.

The second option is to bind only the 20 items to the grid, which will be quick. Switch off paging on the grid, since it will only ever show one page, and implement your own paging behaviour using the PageSize and the current PageNo, which you already know. This takes more work, but it will perform a lot faster than the out-of-the-box gridview binding to a large data source.
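For the hand-rolled pager, the only extra numbers needed are the page count and whether previous/next links should be shown. A minimal sketch, assuming a zero-based PageNo; the module name `PagerSketch` and the helper `PageCount` are hypothetical and not part of ASP.NET or Lucene.Net:

```vbnet
Imports System

Module PagerSketch
    ' Hypothetical helper: number of pages needed to show totalHits results.
    Function PageCount(totalHits As Integer, pageSize As Integer) As Integer
        ' Integer ceiling division; gives 0 when there are no hits.
        Return (totalHits + pageSize - 1) \ pageSize
    End Function

    Sub Main()
        Dim pages As Integer = PageCount(28000, 20)
        Dim pageNo As Integer = 0 ' assumed zero-based current page

        Console.WriteLine("Pages: {0}", pages)
        Console.WriteLine("Has previous: {0}", pageNo > 0)
        Console.WriteLine("Has next: {0}", pageNo < pages - 1)
    End Sub
End Module
```

The total hit count is still available from hits.Length() without loading any documents, so the pager can show the full page count while only one page of documents is ever read.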

And it will help you solve your memory problem.

Daniel Dyson
Thank you for the very helpful answer. I am waiting for access to change the program on the server; once I have it, I will try your suggestion. And do I have to limit the results because I am using multiple pages, like Google for example? Thanks a lot.
tim