views: 220
answers: 2
I'm working on returning a recordset from SQL Server 2008 to do some pagination. I'm only returning 15 records at a time, but I need to have the total number of matches along with the subset of records. I've used two different queries with mixed results depending on where in the larger group I need to pull the subset. Here's a sample:

SET NOCOUNT ON;
WITH tempTable AS (
  SELECT
     FirstName
     , LastName
     , ROW_NUMBER() OVER(ORDER BY FirstName ASC) AS RowNumber 
   FROM People
   WHERE 
      Active = 1
)

SELECT 
   tempTable.*     
   , (SELECT Max(RowNumber) FROM tempTable) AS Records   -- total matches, via the highest row number
FROM tempTable     
WHERE
   RowNumber >= 1
   AND RowNumber <= 15
ORDER BY
   FirstName

This query works really fast when I'm returning items at the low end of the matches, like records 1 through 15. However, when I start returning records 1000-1015, processing goes from under a second to more than 15 seconds.

So I changed the query to the following instead:

SET NOCOUNT ON;
WITH tempTable AS (
  SELECT * FROM (
     SELECT
        FirstName
        , LastName
        , ROW_NUMBER() OVER(ORDER BY FirstName ASC) AS RowNumber 
        , COUNT(*) OVER(PARTITION BY NULL) AS Records   -- windowed total, computed across every matching row
      FROM People
      WHERE 
         Active = 1
   ) derived
   WHERE RowNumber >= 1 AND RowNumber <= 15
)

SELECT 
   tempTable.*     
FROM tempTable     
ORDER BY
   FirstName

That query returns the high-numbered pages in 2-3 seconds, but now the low-numbered pages take 2-3 seconds as well. Because it computes the count across all 70,000+ rows, every request pays the penalty instead of just the ones for high row numbers.

So I need a way to get an accurate row count while returning only a subset of rows from any point in the resultset, without suffering such a huge penalty. I could live with a 2-3 second penalty for the high row numbers, but 15 seconds is too much, and I'm not willing to make the first few pages a person views load slowly.

NOTE: I know the CTE isn't strictly necessary in the second query; this is just a simplified example. In production I do further joins on tempTable after I've filtered it down to the 15 rows I need, roughly as sketched below.
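Here is the rough shape of the production query, with the joins applied after the filter; PersonID, Addresses, and City are stand-ins for the real names:

SET NOCOUNT ON;
WITH tempTable AS (
  SELECT
     PersonID     -- stand-in for the real key column
     , FirstName
     , LastName
     , ROW_NUMBER() OVER(ORDER BY FirstName ASC) AS RowNumber
   FROM People
   WHERE
      Active = 1
),
pageRows AS (
   -- cut the set down to the 15 rows first...
   SELECT * FROM tempTable WHERE RowNumber >= 1 AND RowNumber <= 15
)
-- ...then join only those 15 rows to the other tables
SELECT
   pageRows.*
   , a.City     -- stand-in for the real joined columns
FROM pageRows
INNER JOIN Addresses a ON a.PersonID = pageRows.PersonID
ORDER BY
   FirstName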

A: 

I've handled a situation a bit like this in the past by not bothering to determine a definite row count, and instead using the query plan to give me an estimated row count, much as the first item in this link describes:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=108658

The idea was to deliver whatever rows were asked for within the range (say 900-915) and then return the estimated row count, like

rows 900-915 of approx. 990

which avoided having to count all the rows. Once the user moved beyond that point, I just showed

rows 1000-1015 of approx. 1015

i.e. just taking the last requested row as my new estimate.
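If an approximate total is acceptable, the catalog views can also supply one without scanning the table. A minimal sketch (note it counts the whole People table, so it ignores your WHERE Active = 1 filter):

SELECT SUM(p.rows) AS ApproxRows
FROM sys.partitions p
WHERE p.object_id = OBJECT_ID('dbo.People')
  AND p.index_id IN (0, 1)   -- heap (0) or clustered index (1)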

davek
Unfortunately that won't take the search criteria into account.
Daniel Short
+2  A: 

Here is what I have done (and it's just as fast, no matter which records I return):

-- Parameters include:
@pageNum int = 1,
@pageSize int = 0,


DECLARE 
    @pageStart int,
    @pageEnd int

-- Translate the page number and size into a 1-based row range.
-- A @pageSize of 0 means "return all rows" (see the WHERE clause below).
SELECT
    @pageStart = @pageSize * @pageNum - (@pageSize - 1),
    @pageEnd = @pageSize * @pageNum;


SET NOCOUNT ON;
WITH tempTable AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY FirstName ASC) AS RowNumber,
        FirstName
        , LastName
    FROM People
    WHERE Active = 1
)

SELECT
    -- scalar count over the CTE; the optimizer can satisfy this from
    -- an index without materializing RowNumber for every row
    (SELECT COUNT(*) FROM tempTable) AS TotalRows,
    *
FROM tempTable
WHERE @pageEnd = 0   -- @pageSize = 0 returns everything
OR RowNumber BETWEEN @pageStart AND @pageEnd
ORDER BY RowNumber
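Assuming this lives in a stored procedure, as the parameter list above suggests, calling it looks the same for any page; dbo.GetPeoplePage is a hypothetical name:

EXEC dbo.GetPeoplePage @pageNum = 67, @pageSize = 15;   -- rows 991-1005
EXEC dbo.GetPeoplePage @pageNum = 1, @pageSize = 0;     -- all rows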
Gabriel McAdams
Pretty sure this is the issue right here. `RowNumber` is a derived column and thus not indexed, whereas `COUNT(*)` can use the index. +1.
Aaronaught
You nailed it... processing went from 15-20 seconds to less than 1. Thanks :-)
Daniel Short