views:

223

answers:

4

I have a question about why some SQL (running on SQL Server 2005) is behaving the way it is. Specifically, I made a change to reduce lock contention during an update and it seems to be working in cases I didn't think it would.

Original code:

We had an update statement like this, that was being applied to a table with more than 3,000,000 records:

UPDATE  USER WITH (ROWLOCK)
SET Foo = 'N', Bar = getDate()
WHERE ISNULL(email, '') = ''
AND   Foo = 'Y'

As you can probably guess, this seemed to lock up the USER table for a while. Even with the ROWLOCK hint, other jobs running queries and updates against USER would block until this was done. That's not acceptable for this particular application, so I thought I'd apply a trick I read about by having the update statement only update 100 records at a time. This would give other queries a chance to get at the table occasionally.

Improved code:

DECLARE @LOOPAGAIN AS BIT;
SET @LOOPAGAIN = 1;

WHILE @LOOPAGAIN = 1
  BEGIN
    UPDATE TOP (100) USER WITH (ROWLOCK)
    SET Foo = 'N', Bar = getDate()
    WHERE ISNULL(email, '') = ''
    AND   Foo = 'Y'

    IF @@ROWCOUNT > 0
      SET @LOOPAGAIN = 1
    ELSE
      SET @LOOPAGAIN = 0
  END

This did the trick. Our update did its work and other queries were able to get at the table. All is happiness and light.

The Mystery:

I understand how this improved the performance when there were many records in the table it had to update. By taking a quick run through the loop after every 100 updates, it gave other queries a chance to get at the table. The mystery is that this loop had the same effect even when there were no records affected by the update!

The second time we would run our original query, it would run for only a fraction of the time (say 30 seconds or so), but it would lock up the table during that time even though no records were being changed. But put the query in the loop with the "TOP (100)" clause, and even though it took just as long to do nothing, it freed up the table for other queries!

I'm very surprised at this. Can anyone tell me:

  1. If what I just said is at all clear and,
  2. Why the second block of code allows other queries to get at the table even when there are no records being updated?
A: 
  1. It is clear.
  2. These conditions are VERY HEAVY ISNULL(email, '') = '' AND Foo = 'Y'.

The update seeks all the rows that might need to be updated, that's why it takes always the same time even if there are no rows to be updated.

This is a blind shot, but you should consider adding an Index to both the Email and the Foo fields (not one index each, one for both of them).

Is this the only heavy query you do on this table? Which indexes are in this table?

Alex Bagnolini
And you could further improve the index use by removing the `ISNULL` call in the `WHERE` clause. The optimiser will prefer something like `WHERE (email IS NULL OR email = '') AND foo = 'Y'`.
LukeH
@Luke: Is this real? isnull(email,'') = '' is NOT interpreted by it self as (email is null or email = '')?
Ice
A: 

It seems like SQL Server is choosing a different lock based on the TOP 200, even though you specify ROWLOCK. Can you see any difference in Management -> Activity Montior, under Locks by Object ?

Andomar
A: 

You should also consider refactoring to an update from if with large tables, in my experience they perform better:

UPDATE USER Foo = 'N', Bar = getDate() FROM (SELECT USER.ID FROM USER // optional NOLOCK Hint if you don't care about read uncommitted. WHERE COALESCE(EMAIL, '') = '' AND Foo = 'Y') D WHERE D.ID=USER.ID

kmacmahon
+1  A: 

This sounds like a classic case of lock escalation.

In the first scenario you are updating what looks like it could be a lot of records from your 3,000,000 row table. There are two important things to consider:

  1. SQL Server 2005 will escalate your lock when 5,000 locks are acquired on a single table or index. There are caveats and exceptions to this so see Lock Escalation (Database Engine) for more information.
  2. Lock hints such as ROWLOCK do not prevent lock escalation.
  3. "The Database Engine does not escalate row or key-range locks to page locks, but escalates them directly to table locks."

So based on the above and your description, I would guess that your query, which is attempting to lock rows, is being escalated to a table level lock and is blocking all access to the User table. You notice this blocking because the update is taking a long time since the table is large.

Recommendations to avoid lock escalation are:

  1. Break up large operations to smaller operations (you've done this!).
  2. Tune your query to be as efficient as possible.
  3. As a last resort you can set trace flag 1211 to disable lock escalation (not recommended!).

See How to resolve blocking problems that are caused by lock escalation in SQL Server for more details.

If you want to verify that lock escalation is what is going on you can use the use the SQL Server Profiler and look at the Lock:Escalation event.

Tuzo
The lock escalation makes a lot of sense. But if that was the case, I would expect the lock escalation to take place both with the original statement and with the statement in the loop.Could it be smart enough in the statement with the loop to never escalate to a table-level lock since it knows it would change at most 100 rows?
Mark