We are having some issues with our data layer when large datasets are returned from a SQL server query via a DataReader. As we use the DataReader to populate business objects and serialize them back to the client, the fetch can take several minutes (we are showing progress to the user :-)), but we've found that there's some pretty hard-core locking going on on the affected tables which is causing other updates to be blocked.

So I guess my slightly naive question is, at what point are the locks which are taken out as a result of executing the query actually relinquished? We seem to be finding that the locks are remaining until the last row of the DataReader has been processed and the DataReader is actually closed - does that seem correct? A quick 101 on how the DataReader works behind the scenes would be great as I've struggled to find any decent information on it.

I should say that I realise the locking issues are the bigger problem, but here I'm just asking about the behaviour of the DataReader.

+3  A: 
  1. During the execution of a query, SQL Server can suspend the query if the network buffers are full. This happens if the client does not keep up with reading from the network, i.e. it does not call SqlDataReader.Read(). The query is resumed when the network buffers are freed, i.e. when the client resumes calling SqlDataReader.Read(). This means that while you read the result set from the data reader, the query is still executing on the server (see the sketch after this list). There are more details, like the size of the network buffers, BLOB operations on the client side using SqlBytes.Stream, and others, but the gist of it is that a slow client can cause the query to be suspended, and the query ends only when the client finishes reading.

  2. When reading data under the default isolation level (read committed), SQL Server places short-lived shared (S) locks on the rows it reads. Under higher isolation levels the locks are long-lived and held until the end of the transaction.

  3. If no explicit transaction is used, every SELECT statement runs in an implicit read-only transaction for the duration of the statement.

So from 1, 2, and 3 we can see that a slow client running a query under a high isolation level will cause shared locks to be held for a long time.
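
To make point 1 concrete, here is a minimal sketch of a deliberately slow client (the connection string and table name are placeholders, not from the original question). While the loop sleeps, the network buffers fill, the server suspends the query mid-execution, and any statement-duration locks stay in place until the reader is drained and closed:

    using System.Data.SqlClient;
    using System.Threading;

    class SlowClientDemo
    {
        static void Main()
        {
            using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=SSPI"))
            {
                conn.Open();
                using (var cmd = new SqlCommand("SELECT * FROM dbo.BigTable", conn))
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // Simulate a slow consumer. While we sleep, the network
                        // buffers fill and the server suspends the query, so it
                        // is still executing (and holding its locks) server-side.
                        Thread.Sleep(100);
                    }
                } // the query, and its statement-duration locks, end only here
            }
        }
    }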

Now you need to detail what you are observing:

  • What kind of locks are being held by this query?
    • S, U, X?
    • row, page, table?
    • range locks?
  • Is lock escalation taking place?
  • Why are the locks held for the duration of the query?
    • Do you use REPEATABLE READ or SERIALIZABLE isolation levels? If yes, why?
    • Do you use lock hints? If yes, why?

Your best bet is probably snapshot isolation (requires at least SQL Server 2005), either as the snapshot isolation level or as read committed snapshot. This will eliminate the locking issue completely, but may add some I/O pressure on tempdb, where the row versions are stored.
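
For reference, a rough sketch of both options (the database, connection string, and table name are placeholders; enabling READ_COMMITTED_SNAPSHOT requires exclusive access to the database):

    using System.Data;
    using System.Data.SqlClient;

    class SnapshotDemo
    {
        // One-time setup, run by a DBA:
        //   ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
        //   ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
        // With READ_COMMITTED_SNAPSHOT ON, the existing code needs no changes:
        // ordinary read committed reads become versioned and take no S locks.
        static void Main()
        {
            using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=SSPI"))
            {
                conn.Open();
                // Explicit snapshot isolation: the reader sees a consistent
                // version of the data and neither blocks nor is blocked by writers.
                using (SqlTransaction tx = conn.BeginTransaction(IsolationLevel.Snapshot))
                {
                    using (var cmd = new SqlCommand("SELECT * FROM dbo.BigTable", conn, tx))
                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        while (reader.Read()) { /* marshal the row */ }
                    }
                    tx.Commit();
                }
            }
        }
    }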

Another solution would be a cursor, but that is intrusive on the existing code base, complex, and still prone to errors (you must get the cursor type right).

BTW, I do not recommend changing the client behavior. I assume right now you're marshaling the business objects back as you read them, inside the SqlDataReader.Read loop, and that is the way to do it. Reading everything into memory up front and then marshaling would only add problems on large data sets.
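
Something along these lines, in other words; the business object and column names here are invented for the example, but each row is handed off as soon as it is read, so memory use stays flat regardless of result set size:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    // Hypothetical business object, just to show the shape of the loop.
    class Customer
    {
        public int Id;
        public string Name;
    }

    static class CustomerRepository
    {
        // Yields each object as soon as its row arrives, so the caller can
        // serialize it back to the client immediately instead of buffering
        // the whole result set in memory first.
        public static IEnumerable<Customer> Fetch(string connectionString)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (var cmd = new SqlCommand("SELECT Id, Name FROM dbo.Customer", conn))
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        yield return new Customer
                        {
                            Id = reader.GetInt32(0),
                            Name = reader.GetString(1)
                        };
                    }
                }
            }
        }
    }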

Remus Rusanu
Point 1 is interesting: I see it quite often
gbn
Fantastic answer, thanks. The query is executing under the read committed isolation level, and setting READ_COMMITTED_SNAPSHOT ON does indeed solve the blocking issue. I'm still curious as to why the locks (most of which are shared (S) page locks) are held for the duration of the query, as no lock hints are being used and, according to SQL Profiler, no lock escalation is taking place.
Graham
A: 

Selecting into a temp table would reduce the locking duration:

select blah into #temp from tbl << locks held and released

select * from #temp << take all the time you want now
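
One caveat: a #temp table is scoped to the connection, so both statements must run on the same SqlConnection. A rough sketch (connection string, table, and column names are made up):

    using System.Data.SqlClient;

    class TempTableDemo
    {
        static void Main()
        {
            using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=SSPI"))
            {
                conn.Open();

                // Step 1: copy the rows; locks on tbl are taken and released
                // within this single statement.
                using (var copy = new SqlCommand("SELECT blah INTO #temp FROM tbl", conn))
                    copy.ExecuteNonQuery();

                // Step 2: stream from the temp table at leisure; no locks on
                // tbl are held any more.
                using (var cmd = new SqlCommand("SELECT * FROM #temp", conn))
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read()) { /* marshal the row */ }
                }
            }
        }
    }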

Jack