views:

117

answers:

3

I need to fetch and show data on a webpage whose number of records may vary based on filters from around 500 records to 1 million records.

Will caching be of any use here as I think million record in memory is not a good thought. SqldataReader?

Paging is a must to implement of course. Showing 1 million records is a worst case scenario(the stupid All filter in use-cases!).

Should I use connected architecture(SqlDataReader) or disconnected architecture(DataSets)?

+5  A: 

First of all, think about it like this: displaying 1 million records makes absolutely no sense to any user. So, you have to think about what the user expects to see. Maybe a summary?! Maybe paginate the records in pages of say 25, or 50 or 100 records. Either of these approaches will not require you to hold 1 M records at a time in memory.

Also, when you run a query against a SQL database and use a SqlDataReader you will not be receiving all the records but instead the SQL driver will send the query to the SQL server, the server will execute the query, prepare a result set and create a forward-only cursor on the server. Then the driver will fetch a record at a time, every time you call Read() on your SqlDataReader. The behavior is very similar if you use LINQ to SQL which uses deferred execution. The result set is not transferred over in full until (or unless) you specifically request each and every row.

So, a simple pagination query will do the trick. Or in other cases some sort of summary report that aggregates the data from those 1 million records one or two pages of relevant data.

Of course if you do need to move back and forth through the pages, some sort of caching might make sense but again, think about it: how often will a user actually want to browse through 1 million records - probably never.

As a last note, if you do implement pagination - make sure that the method you use to implement the pagination relies on the SQL server sending data one page at a time and not reading all 1 million records into ASP.NET and then paginating the local copy of the data because that would be very inefficient and slow. Here is an example of a SQL Server query that performs pagination: SO Question #109232

Miky Dinescu
Nice detailed explaination. Thanks!But still a question: Won't DataSets be of any help here?
Manish
A DataSet works differently. As opposed to the SqlDataReader which has a forward-only behavior, a DataSet has to bring all data from the server locally before operating on it. This makes it easier to use in some cases but if you're query could return 1 million records DatSets are not the solution. In the end it all depends on the query.
Miky Dinescu
+1  A: 

If your server can't cache 1 million records how do you think your user's web browser is going to handle 1 million records worth of HTML coming at it?

Consider paging (here is an example with 1 million records)

Also consider that the user never wants more than about 30 to 50 records. You are either showing them too low level of detail, or you need more filtering.

tster
That link was cool. Sort of what I want. But what is it using in the background..DataSets? I see the retrieval is quite quick.
Manish
@Manish, the background is agnostic to that grid. Using any language you can connect to any database. Then the trick is to write SQL queries which utilize indexes correctly such that returning rows n-m is fast.
tster
+2  A: 

I concur with the rest of the answerers. displaying 1M records is ludicrous. However, you can display the first X records, and page through.

I wrote a blog post on this here. (note: it's using the Telerik RadGrid, but the concept is the same)

The trick is in the Stored Procedure doing the fetching

ALTER PROCEDURE [dbo].[MyHugeTable_GetWithPaging] 
( 
        @StartRowIndex      int, 
        @MaximumRows        int 
) 

AS 
SET NOCOUNT ON 

Select 
    RowNum, 
    [UserName]
From 
    (Select 
        [ID], 
        [UserName]
        Row_Number() Over(Order By [ID] Desc) As RowNum 
        From dbo.[MyHugeTable] t) 
As DerivedTableName 
Where RowNum Between @StartRowIndex And (@StartRowIndex + @MaximumRows) 
rockinthesixstring
Actually the DB is MS-SQL Server 2008.
Manish
Yup, that's what I use.
rockinthesixstring
Nice! Interesting!!
Manish