I turned snapshot isolation on in my database using the following code:

ALTER DATABASE MyDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON

and got rid of a lot of deadlocks.

But my database still produces deadlocks when I run a script every hour to clean up 100,000+ rows.

  • Is there a way I can avoid deadlocks in my cleanup script? Do I need to specify ROWLOCK in that query?
  • Is there a way to increase the number of row level locks that a database uses?
  • How are locks escalated? From row level to page level to table level?

My delete script is rather simple:

delete statvalue
from statValue,
(select  dateadd(minute,-60, getdate()) as cutoff_date) cd
where temporaryStat = 1
and entrydate < cutoff_date

Right now I am looking for a quick solution, but a long-term solution would be even nicer.

Thanks a lot, Patrikc

A: 

The way to reduce (or avoid) these deadlocks is to delete in batches, with a short wait between each batch (using WAITFOR DELAY).

Deadlocks can also be mitigated by having a covering index.
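As a sketch of what such a covering index might look like (the column names are taken from the delete query in the question; the index name is made up):

```sql
-- Hypothetical covering index for the cleanup query's filter.
-- Lets the delete seek to qualifying rows instead of scanning the table.
CREATE NONCLUSTERED INDEX IX_statValue_temporaryStat_entrydate
    ON dbo.statValue (temporaryStat, entrydate);
```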

This is an example only (run at your own risk!).

DECLARE @intRowCount INT,
        @intErrNo INT,
        @intRowsToDelete INT,
        @DEL_ROWCOUNT INT

SELECT  @intRowCount = 1,
        @intErrNo = 0,
        @DEL_ROWCOUNT = 1000 -- Batch size; tune to suit

DECLARE @cutoff_date DATETIME

SET @cutoff_date = dateadd(minute, -60, getdate())

SELECT  @intRowsToDelete = COUNT(*) -- Number of rows to be deleted
FROM dbo.statValue
WHERE temporaryStat = 1
AND   entrydate < @cutoff_date

WHILE @intRowCount > 0 AND @intErrNo = 0
BEGIN

    SET ROWCOUNT @DEL_ROWCOUNT

    DELETE dbo.statValue
    WHERE temporaryStat = 1
    AND   entrydate < @cutoff_date

    SELECT @intErrNo = @@ERROR, @intRowCount = @@ROWCOUNT

    SET ROWCOUNT 0 -- Reset batch size to "all"

    SELECT @intRowsToDelete = @intRowsToDelete - @intRowCount

    WAITFOR DELAY '000:00:02' -- Pause between batches; the delay value is an example
END

I've also included John's suggestion of not repeatedly calculating the date range criteria.

Mitch Wheat
Don't rely on SET ROWCOUNT. It has problems and is on its way to obsolescence. Just DELETE TOP in a loop and break on @@ROWCOUNT = 0: `while (1=1) begin delete top(@del_count) where ...; if @@rowcount=0 break; end`.
Remus Rusanu
From BOL: "Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next release of SQL Server"
Remus Rusanu
@Remus Rusanu: Cheers. I'd completely forgotten about that. Though it seems doubtful it will be removed...
Mitch Wheat
@Remus Rusanu: did you mean if @@rowcount=@del_count break; ??
Mitch Wheat
@Mitch: no, break on @@rowcount = 0. You loop until the delete doesn't find anything to delete, then exit the loop. Don't bother tracking the actual deleted count. The filter has to be stable, though: if the loop changes the filter every iteration (e.g. uses GETDATE inside the loop), then it can stay in the loop ad nauseam, because with each iteration one or two more rows become 'current' and it never exits.
Remus Rusanu
Also, the ROWCOUNT enforcement on delete/update/insert will be removed, I'm pretty sure. It was always broken anyway on sufficiently complex insert/update/delete queries.
Remus Rusanu
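Spelling out the pattern from the comments above as a sketch (table and column names are taken from the question; the batch size and delay are assumptions):

```sql
DECLARE @del_count INT = 1000 -- Batch size; tune to suit
DECLARE @cutoff_date DATETIME

-- Compute the cutoff ONCE, outside the loop, so the filter stays stable
-- (per the comment about GETDATE inside the loop never letting you exit).
SET @cutoff_date = dateadd(minute, -60, getdate())

WHILE (1 = 1)
BEGIN
    DELETE TOP (@del_count)
    FROM dbo.statValue
    WHERE temporaryStat = 1
      AND entrydate < @cutoff_date

    IF @@ROWCOUNT = 0 BREAK -- Nothing left to delete; exit the loop

    WAITFOR DELAY '000:00:02' -- Let other writers through between batches
END
```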
A: 

Please re-write your delete query like so:

DECLARE @cutoff_date DATETIME

SET @cutoff_date = dateadd(minute,-60, getdate())

delete statvalue
from statValue
where temporaryStat = 1
and entrydate < @cutoff_date

You will see a cost reduction in your execution plan.

John Sansom
+1  A: 

SNAPSHOT isolation can only mitigate (some of) the deadlocks involving reads; it does absolutely nothing to avoid write vs. write deadlocks. If you generate 100k+ rows per hour, that is ~30 inserts per second, so the delete scan is pretty much guaranteed to conflict with other write operations. If all you do is insert, never update, then at row-level locking the delete would block, but not deadlock; however, because the table is big enough and the delete is doing a scan, the engine will likely choose a page lock for the delete, hence probably the deadlocks you get.

Without an index on entrydate, the delete has no choice but to scan the entire table. Tables like this, which get frequently inserted at the top and deleted at the bottom, are in fact queues, and you should organize them by entrydate. That means entrydate should probably be the leftmost key in the clustered index. This organization allows for a clear separation between the inserts occurring at one end of the table and the deletes occurring at the other end. But this is a rather radical change, especially if you use the statvalueid to read these values. I guess right now you have a clustered index based on an auto-increment field (StatValueId). I also assume that entrydate and statvalueid are correlated. If both assumptions are true, then you should delete based on statvalueid: find the largest id that is safe to delete, then delete everything on the clustered index to the left of that id:

declare @statvalueidmax int;
select @statvalueidmax = max(statvalueid) 
 from statvalue with (readpast)
 where entrydate <  dateadd(minute,-60, getdate());

delete statvalue
 where statvalueid <= @statvalueidmax;

There are a number of assumptions I made, they may be wrong. But the gist of the idea is that you have to separate the inserts from the deletes so they don't overlap.
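For completeness, the re-organization described above might be sketched like this; every name here except statValue, statvalueid and entrydate is hypothetical, it assumes the current clustered index is a primary key on statvalueid, and rebuilding the clustered index on a large live table is an intrusive operation:

```sql
-- Assumption: the table is currently clustered on an auto-increment
-- primary key named PK_statValue. Move that key to a nonclustered index,
-- then cluster on entrydate so inserts land at one end of the table and
-- the hourly delete eats away at the other.
ALTER TABLE dbo.statValue DROP CONSTRAINT PK_statValue;

ALTER TABLE dbo.statValue
    ADD CONSTRAINT PK_statValue PRIMARY KEY NONCLUSTERED (statvalueid);

CREATE CLUSTERED INDEX IX_statValue_entrydate
    ON dbo.statValue (entrydate);
```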

Remus Rusanu
I like this idea and will try it, but I am not sure it will help with the deadlocks that are created. In the future I expect even more data coming into this table.
Patto
There are a lot of factors at play and I have very little information. But you should try to imagine the table as a line (ordered by your clustered key, left-to-right): your hourly batch job eats away at the left side of the line, while inserts, lookups and updates occur at the current activity hotspot, on the right edge. If the two don't meet, no deadlock occurs.
Remus Rusanu