What is the fastest way to select a range of rows, say rows 4,200,000 to 4,200,050, in SQL Server 2005? Assume the table has 10 million rows.

In my own projects I use the following approach, but I'm not sure it is the best practice:

select * from
(
 select
  Column1, Column2, Column3,
  RowNumber = row_number() over (order by ID asc)
 from
  tblLogs
 where
  Column4 = @Column4 and Column5 = @Column5
) as tempTable
where tempTable.RowNumber >= @StartIndex and tempTable.RowNumber <= @EndIndex

With the code above I am tempted to say that tempTable will be a big table with one column containing all my IDs.

Is there anything faster?

Please don't suggest workarounds that compute the range from the ID column. That won't work: I delete rows from this table, so my IDs are not consecutive.

A: 

If you are paging, you can pass in the first and last key of the current page and use them to restrict your derived "tempTable", so that it returns fewer rows and runs faster.

KM
If you delete rows from the table, your approach won't work.
pixel3cs
top N... where key>@LastKey
KM
This still won't work if you are sorting by a date, for example. And it can never work if too many items share the same sort value; you would end up skipping items.
John Gietzen
The OP hasn't given any info on keys or filter criteria, just that this derived table will contain 4 million rows and he wants 50 of them. If you can't keep some sort of value from the current page to help you find the next page, then you deserve to process 4 million rows each time! You could always load 52 rows, 1 for the last item on the previous page and 1 for the first item on the next page, and go from there.
KM
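The "top N... where key>@LastKey" idea from the comments above can be sketched as follows. This is a minimal illustration, not KM's actual code: it assumes ID is a unique, indexed key and that pages are read in ID order. @LastKey and @PageSize are hypothetical parameters; @Column4 and @Column5 are the question's parameters and are assumed to be declared by the caller.

```sql
-- Sketch only: assumes ID is unique and indexed, and that paging follows ID order.
-- @LastKey and @PageSize are hypothetical; @Column4/@Column5 come from the question.
DECLARE @LastKey int, @PageSize int;
SET @LastKey = 0;      -- highest ID on the previous page (0 for the first page, assuming positive IDs)
SET @PageSize = 50;

SELECT TOP (@PageSize)
    ID, Column1, Column2, Column3
FROM tblLogs
WHERE Column4 = @Column4
  AND Column5 = @Column5
  AND ID > @LastKey    -- seek past the previous page instead of numbering every row
ORDER BY ID ASC;
```

Gaps left by deleted rows do not matter here, because the filter compares key values rather than counting positions. The trade-offs raised in the comments still apply: the sort key must be unique (or be combined with a unique tie-breaker), and this form only moves to the next page; it cannot jump straight to row 4,200,000.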
+2  A: 

This article over at SQLServerCentral is excellent:
SQL Server 2005 Paging – The Holy Grail

Mitch Wheat
gross sqlservercentral.com
Greg Dean
That is pretty conclusively the most thorough run-down I have seen.
John Gietzen
@Greg Dean: ??? what do you mean? There is some great info at that site.
Mitch Wheat
It's one of those anti-SO sites. Bait and switch, to get in search results. Not as bad as Experts Exchange, but definitely on the list.
Greg Dean
It has some rubbish in the forums (well, lots) but great articles. Anything or anyone of interest will eventually turn up here.
gbn
@Greg Dean: Really? I had no idea.
Mitch Wheat
@Mitch yea, clear your cookie(s) and follow your link.
Greg Dean
+1  A: 

I noticed that you have a lot of rows. If Column4 and Column5 are not already indexed, adding indexes on them should improve performance dramatically.
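As a rough sketch of such an index, one option is a single composite index that covers both filter columns and the ORDER BY column. The index name is hypothetical, and including Column1 to Column3 is an assumption that the query selects only those columns.

```sql
-- Sketch only: the index name is hypothetical; column names come from the question.
-- Equality filters first (Column4, Column5), then ID so rows come back already ordered.
CREATE NONCLUSTERED INDEX IX_tblLogs_Column4_Column5_ID
    ON tblLogs (Column4, Column5, ID)
    INCLUDE (Column1, Column2, Column3);  -- lets the query be answered from the index alone
```

Whether the INCLUDE columns are worth their storage and write cost depends on how wide they are and how often the table is modified, so it is worth comparing execution plans with and without them.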

I found the following article interesting: Ranking Functions and Performance in SQL Server 2005

I will let you figure out how to improve your query according to the article, if possible. I tested their solutions myself and they work.

If you're looking into paging in ASP.NET, I also found the following article by Scott Mitchell very interesting: Custom Paging in ASP.NET 2.0 with SQL Server 2005

I used their method in my code and it works just great. Here is a sample of T-SQL code:

    SELECT ROWNUM, COLUMN1, COLUMN2, COLUMN3
    FROM (
        SELECT COLUMN1, COLUMN2, COLUMN3,
               ROW_NUMBER() OVER(ORDER BY ID) AS ROWNUM
        FROM TABLE1
        WHERE COLUMN4 = @X AND COLUMN5 = @Y
    ) AS TABLE2
    WHERE ROWNUM BETWEEN @startRowIndex AND (@startRowIndex + @maximumRows) - 1

I suggest reading the 4guysfromrolla's article for further information.
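To make the BETWEEN bounds in the sample concrete, here is a small worked example of deriving them from a page number. @PageNumber is a hypothetical 1-based parameter that is not part of the sample, and the sketch assumes @startRowIndex is 1-based, matching ROW_NUMBER().

```sql
-- Sketch only: @PageNumber is a hypothetical 1-based page number.
DECLARE @PageNumber int, @maximumRows int, @startRowIndex int;
SET @PageNumber = 3;
SET @maximumRows = 10;
SET @startRowIndex = (@PageNumber - 1) * @maximumRows + 1;   -- (3 - 1) * 10 + 1 = 21

SELECT @startRowIndex                     AS FirstRow,       -- 21
       (@startRowIndex + @maximumRows) - 1 AS LastRow;       -- 30
```

Page 3 with 10 rows per page therefore maps to ROWNUM BETWEEN 21 AND 30.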

Good Luck

Maxime
That article in the first link is a pile of BS. Use a const as the ORDER BY expression to eliminate the sort? For one, the results are incorrect, as the row_number() becomes execution dependent and not reproducible (hey, every 113th visitor on my site sees page 5 with the same content as page 11... gee, I wonder why...). And second, the correct solution of using ORDER BY over an *indexed* column is only mentioned at the end, as a note.
Remus Rusanu
I found their workarounds interesting and haven't used them myself in a real situation, but I grant you that they are probably very hazardous. I deliberately wrote "improve [...] according to the article if possible" in case it proved to be "dangerous". Thanks for pointing this out, though. However, the second article, and its references pointing at ScottGu's blog entry on typed DataSets, are great =).
Maxime
A: 

OK. These are my final thoughts on this problem.

For big projects, with tables containing 10 million rows or more, I will use this approach:

  select * from
  (
    select
      myTable.*,
      RowNumber = row_number() over (order by myTable.ID asc)
    from
      myTable
    where
      myCondition
  ) as tempTable
  where tempTable.RowNumber >= @StartIndex and tempTable.RowNumber <= @EndIndex
  • For ASP.NET paging I'll use the SELECT above, which works very fast for the first 100,000 rows (10,000 pages at 10 rows per page). Beyond page 10,000 the query gets slower and slower, eventually very slow, but no one will want to browse to page 10,001!

  • For counting the number of rows (and therefore pages) that fulfill myCondition in the SELECT above, I'll create a special table with only one row and one column, and store the number of rows in that column. Every time I add, modify or delete a row in myTable, I will update this column based on myCondition, incrementing or decrementing it by 1. The purpose is to read the number of matching rows quickly, so I can show my users how many pages there are.
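The counter table described above could be sketched along these lines. This is one possible way to keep the count up to date, using a trigger, and not necessarily how the author implemented it. The names tblRowCount, TotalRows and trg_myTable_RowCount are hypothetical. For simplicity the sketch counts every row in myTable; a filtered count would apply myCondition to the inserted and deleted pseudo-tables and would also need to handle updates.

```sql
-- Sketch only: tblRowCount, TotalRows and trg_myTable_RowCount are hypothetical names.
-- Counts all rows in myTable; applying myCondition is left out of this sketch.
CREATE TABLE tblRowCount (TotalRows int NOT NULL);
INSERT INTO tblRowCount (TotalRows)
SELECT COUNT(*) FROM myTable;           -- seed the counter once
GO

CREATE TRIGGER trg_myTable_RowCount ON myTable
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- inserted/deleted hold the rows affected by the statement that fired the trigger
    UPDATE tblRowCount
    SET TotalRows = TotalRows
                  + (SELECT COUNT(*) FROM inserted)
                  - (SELECT COUNT(*) FROM deleted);
END
GO

-- Reading the page count: integer division rounded up.
DECLARE @PageSize int;
SET @PageSize = 10;
SELECT (TotalRows + @PageSize - 1) / @PageSize AS PageCount
FROM tblRowCount;
```

Using set-based counts of inserted and deleted, rather than adding or subtracting 1, keeps the counter correct for multi-row statements. The cost is that every write to myTable also updates the single counter row, which can become a contention point under heavy concurrent writes.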

pixel3cs