views:

92

answers:

2

We currently have a system that displays a page tabular data on the screen without any paging support in the user interface. It is running on Java 1.5 / Spring JDBC / T-SQL stored procs / SQLServer 2000 stack.

In the absence of the ability to skip rows in the result set (limitation of SQLServer 2K without using dynamic SQL); I am exploring the option of having the data layer select all rows and have the DAO layer scroll through the skipped pages of rows and then only read off a page worth of rows.

My question is this:

How much of a performance gain (in terms of DB CPU and I/O) is this change going to be compared with the current state where all rows are returned?

I know that there will be only a page worth of data going over the wire between the DB and the Application, but I am interested to know what will happen inside the DBMS. Assuming the query plan is already cached, is the DBMS going to skip the processing of the first 40 pages of results if i only want page 41?

I guess I'm looking to know if we incur much of a cost even though the cursor is going to skip the first x pages of the result set.

+1  A: 

Why not use a unit test to get some numbers.

So, you start with your current setup, and have 5 tests, skip 0, 2, 4, 6, 8 pages, and see if there is a difference between skipping 8 and 2 pages.

Then, once you have a baseline, why not use dynamic SQL and return just the rows that are of interest.

Write another test and see what happens.

Then, have a stored procedure that selects everything but just returns the rows of interest, and have another test for that. for your Then, try the test with the idea of the DAO doing the filtering.

It is difficult to give any real idea as to the performance hit the last one will have, as there are many factors that we don't have, but I expect that the more work you can have the stored procedure do the faster you will go, overall.

I tend to find unit tests useful to see what is the best option, as you can then compare it under load, look at what happens to the CPU and memory. You can measure whatever is important to determine which option will be best for your design.

James Black
Thanks James, a valid approach for sure. I guess I'm looking for a more theoretical rather than empirical answer. I want to know if the is the DBMS going to actually read data off the rows even if the cursor scrolls past it.
anger
@bart - I think it is because of my engineering background, but before I make a decision I like numbers, but you may want to modify your answer adding your comment here, as that sounds like the real question, the rest was just filler, IMO.
James Black
+1  A: 

If you have a BTree (index, clustered or non-clustered) then the only way to go to page X is to know a key on the page and seek straight to it. Every other mean to 'skip' first X-1 pages will have to go trough the all the pages from 1 to X and skip each record individually. A narrow index on the 'paged' field can help to count, as high density slots (hence the narrow index) reduce the number of pages that have to be scanned to find the row that starts the page X.

Remus Rusanu
Thanks Remus, I recognize that the cursor that drives the data access is going to have to skip to the row that corresponds to the first row of page x. When we use JDBC to scroll to row (x * pagesize ) does this mean that the SQLServer is going to read from the disk all of the blocks for rows from page 0 to page x-1?
anger
If you want to skip 1000 rows, SQL has to count 1000 rows on 'disk' so it can locate the row you're interested. I say 'disk' because is not necessarily disk access, it can be cached in the buffer pool.
Remus Rusanu
Thanks Remus - I've got it now.
anger