I'm going to run some queries against a SQL Server database, followed by a delete. Ideally all of this happens inside a single transaction (i.e. atomically).

But practically, because the data has long since been purged from the buffer pool, SQL Server will have to perform a lot of physical I/O in order to complete the transacted T-SQL. This can be a problem: if the entire batch takes longer than 30 seconds to run, users will experience timeouts.

I noticed that I can run my selects in pieces, each run including more and more of the final SQL, letting SQL Server fill the buffers with more and more of the required data. e.g.:

First run:

BEGIN TRANSACTION
SELECT ... WHERE ...
ROLLBACK

Second run:

BEGIN TRANSACTION
SELECT ... WHERE ...
SELECT ... WHERE ...
ROLLBACK

...

n-th run:

BEGIN TRANSACTION
SELECT ... WHERE ...
SELECT ... WHERE ...
...
SELECT ... WHERE ...
ROLLBACK

And by the time I reach the final run:

BEGIN TRANSACTION
SELECT ... WHERE ...
SELECT ... WHERE ...
...
SELECT ... WHERE ...
DELETE FROM ... WHERE ...
COMMIT

The entire batch runs fast, since the buffers are pre-filled.
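
For what it's worth, one way to confirm that the warm-up runs are actually doing what I think they are is to watch the I/O statistics: on a warmed run the physical reads should drop to (near) zero, leaving only logical reads. A sketch; the table and predicate below are placeholders, not my real schema:

SET STATISTICS IO ON

BEGIN TRANSACTION
SELECT TransactionID, Amount
FROM dbo.Transactions              -- placeholder table
WHERE PostedDate < '20090101'      -- placeholder predicate
ROLLBACK

-- Messages tab, cold run:  "physical reads 1234, read-ahead reads 5678"
-- Messages tab, warm run:  "physical reads 0" (logical reads only)

SET STATISTICS IO OFF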

Is there a mode in SQL Server (e.g. something like SET NOEXEC ON) that would cause SQL Server to perform no actual data modifications and take no locks, but still fill the buffers with the needed data? e.g.

SET NOEXEC ON
EXECUTE ThatThingYouDo

SET NOEXEC OFF
EXECUTE ThatThingYouDo

or

SET DRYRUN ON
EXECUTE ThatThingYouDo

SET DRYRUN OFF
EXECUTE ThatThingYouDo
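
In case no such mode exists, a sketch of the closest workaround I can think of: run the same selects as a warm-up pass outside the transaction, with NOLOCK so no shared locks are taken, purely to drag the needed pages into the buffer pool, and only then run the real transacted batch. (The table names below are made up; the warm-up has to use the same predicates and columns as the real selects, otherwise it may warm a different index's pages than the bookmark lookups will need.)

-- Warm-up pass: no transaction, no locks, just the physical reads
SELECT TransactionID, Amount, PostedDate
FROM dbo.Transactions WITH (NOLOCK)
WHERE PostedDate < '20090101'

SELECT AuditID, TransactionID
FROM dbo.AuditTrail WITH (NOLOCK)
WHERE TransactionDate < '20090101'

-- Real run: the same pages should now be served from the buffer pool
BEGIN TRANSACTION
SELECT ... WHERE ...
...
DELETE FROM ... WHERE ...
COMMIT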
+1  A: 

I have found that whenever you try to do something extremely out of the ordinary to solve a problem, your basic design is most likely the real problem. This is extremely out of the ordinary.

Perhaps you could provide more info on the DELETE that is taking so long (table size, activity, indexes, rows to delete, other processes running, etc.), and there may well be a conventional solution, using an index, locking hints, etc., to address it.

KM
Let's say it's 10M rows, and I'll be bookmark-looking-up 3,000 of those rows. It's actually a distributed transaction, across a linked server, with 20 selects followed by 20 deletes.
Ian Boyd
What do the 20 selects do (return 10M rows)? How are they related to the deletes?
KM
In other words, I'm archiving data out of an OLTP server.
Ian Boyd
Imagine the NSA coming to your bank and saying, "We want all your financial transactions, grouped into buckets where the value of the transaction, calculated at the daily MasterCard exchange rate, is modulo the first 50 prime numbers." The database is going to have to start chewing on old data. And since "this is extremely out of the ordinary", I'm not going to be revising my database structure for it.
Ian Boyd
A: 

No.

Let's say you have an INSERT followed by an UPDATE of the inserted rows. You'll never be able to emulate the UPDATE, because the rows it would touch don't exist yet.

Now, in this case, why are your selects in a transaction at all? By default (i.e. unless you use HOLDLOCK or similar), you are not locking the rows for the duration of the transaction.
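
A sketch of that default behaviour (the table name is a placeholder):

BEGIN TRANSACTION

-- Default READ COMMITTED: shared locks are released as soon as each row is read,
-- so nothing stays locked for the rest of the transaction.
SELECT Amount FROM dbo.Transactions WHERE TransactionID = 42

-- HOLDLOCK (equivalent to SERIALIZABLE for this table): shared/range locks are
-- held until COMMIT/ROLLBACK, so concurrent writers block for the whole transaction.
SELECT Amount FROM dbo.Transactions WITH (HOLDLOCK) WHERE TransactionID = 42

ROLLBACK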

If you think the buffer pool (aka data cache) is "full", then you basically need more RAM, or some other upgrade/scale-up.

gbn
The "problem" you described isn't a problem. a) freshly inserted rows would be in the buffer b) i don't care if not every last row is in the buffers c) i'm not doing inserts into the database that needs its buffers filled. And although i don't **want** to hold the rows for the duration of the transaction, SQL Server can hold locks on parts of a table while a select runs (e.g. locks index until the matching rows have been bookmark looked up.) If the select takes longer than 30 seconds: boom, timeout.
Ian Boyd
Oh, and I don't think the buffers are full; SQL Server just has to go sort through a few gigabytes of data.
Ian Boyd
Accepted for "No". If it doesn't have it, then that is the answer.
Ian Boyd