I am having timeout issues, and deadlocks from time to time, with some long-running queries.

I'm wondering when and where it is most appropriate to use NOLOCK.

Do I use it on updates and inserts, or on reads?

+1  A: 

Use it when it is acceptable to have dirty reads and phantom records. For example, you may have non-critical reports running regularly where the accuracy of the information is not the primary driver, but having a view of the volume of records, or some other metric, is.

Russ Cam
The fallout from failed transactions can be pretty bad too. If you are studying NOLOCK, you are not in an ideal situation to begin with.
Marco van de Voort
+5  A: 

Note that you can specify NOLOCK on a per-table basis.
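
A minimal sketch of what that looks like (the table names are hypothetical), applying the hint only to a static lookup table inside a larger join:

SELECT o.OrderId, o.Quantity, p.Price
FROM Orders o                      -- busy table: normal locking
JOIN PriceList p WITH (NOLOCK)     -- static lookup table: dirty reads acceptable
    ON p.ProductId = o.ProductId;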

I typically used NOLOCK in complex SELECT queries, but only for the little lookup tables that almost never changed, and for display-only data. You know, the tables that list the prices for the current half year, lookups of ids to strings, etc. Stuff that only changes with major updates, after which the servers are usually restarted routinely anyway.

This improved performance significantly and reduced the chance of deadlock at the busiest times. More importantly, it was really noticeable during the worst-case moments for queries that touched a lot of tables. That is logical: such queries have to obtain fewer locks, and those side tables are used nearly everywhere, so the number of tables that needed locking often dropped from 7-8 to 4.

But be very careful adding it, don't rush it, and don't do it routinely. It won't hurt when used properly, but it will hurt horribly when used improperly.

Don't use it for highly critical data, for anything that performs calculations, or for anything that sooner or later leads to a write, because the results will get inconsistent.

Another such optimization is ROWLOCK, which only locks at the row level. This is mainly useful when updating (or deleting from) tables where the rows are not related to each other, like tables where you only insert log records (and the order in which they are inserted doesn't matter). If you have a scheme where a log record is written to some table near the end of a transaction, this can speed things up considerably too.
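
A minimal sketch of that hint on a log insert (AuditLog and its columns are hypothetical):

-- ROWLOCK asks the engine for row-level locks instead of page or table locks,
-- so concurrent inserts at the end of other transactions don't block each other
INSERT INTO AuditLog WITH (ROWLOCK) (EventTime, Message)
VALUES (GETDATE(), 'order processed');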

If your database has a relatively low percentage of writes, it might not be worth it. I had a read:write ratio of under 2:1.

Some URLs I saved when working on this:

http://www.developerfusion.com/article/1688/sql-server-locks/4/

Marco van de Voort
+2  A: 

You should use NOLOCK when it is OK to read dirty data. A large transaction that makes a number of changes to the database may still be in progress; reading with NOLOCK will return the data that transaction has written so far. Should that transaction then roll back, the data you were looking at could be wrong. Therefore, you should only use it when it doesn't matter that what you get back could be wrong.
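
A minimal sketch of such a dirty read, using a hypothetical Accounts table:

-- Session 1 starts a long transaction:
BEGIN TRAN;
UPDATE Accounts SET balance = 0 WHERE id = 42;
-- ... more work, still not committed ...

-- Session 2 meanwhile reads past the exclusive lock:
SELECT balance FROM Accounts WITH (NOLOCK) WHERE id = 42;  -- sees 0

-- Session 1 changes its mind:
ROLLBACK;  -- the 0 that session 2 read never officially existed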

Deadlocks are a common problem, but 9 times out of 10 they are entirely a developer problem. I would concentrate on finding the cause of the deadlocks rather than using NOLOCK. More than likely it is just one transaction doing things in a different order from all the others; fixing just that one may make all your issues vanish.
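
A minimal sketch of that classic ordering mistake (TableA and TableB are hypothetical):

-- Session 1:
BEGIN TRAN;
UPDATE TableA SET col = 1 WHERE id = 1;  -- locks a row in TableA
UPDATE TableB SET col = 1 WHERE id = 1;  -- now waits for session 2

-- Session 2, doing the same work in the opposite order:
BEGIN TRAN;
UPDATE TableB SET col = 1 WHERE id = 1;  -- locks a row in TableB
UPDATE TableA SET col = 1 WHERE id = 1;  -- waits for session 1: deadlock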

Robin Day
+3  A: 

SQL Server has the four ANSI transaction isolation levels (plus SNAPSHOT, added in SQL Server 2005):

  1. READ UNCOMMITTED
  2. READ COMMITTED
  3. REPEATABLE READ
  4. SERIALIZABLE

For the tables it's applied to, NOLOCK is the equivalent of "read uncommitted". That means you can see rows from transactions that might be rolled back in the future, and many other strange results.
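
For a single-table query the two forms are interchangeable (MyTable is a hypothetical name):

SELECT * FROM MyTable WITH (NOLOCK);

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM MyTable;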

Still, NOLOCK works very well in practice, especially for read-only queries where displaying slightly wrong data is not the end of the world, like business reports. I'd avoid it near updates or inserts, or generally anywhere near decision-making code, especially if it involves invoices.

As an alternative to NOLOCK, consider READ_COMMITTED_SNAPSHOT, which is meant for databases with heavy read and lighter write activity. You can turn it on with:

ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON;

It is available in SQL Server 2005 and higher. This is how Oracle works by default, and it's what Stack Overflow itself uses. There's even a Coding Horror blog entry about it.

P.S. Long-running queries and deadlocks can also indicate that SQL Server is working from wrong assumptions. Check whether your statistics or indexes are out of date:

SELECT
    ObjectName     = OBJECT_NAME(ind.object_id),
    IndexName      = ind.name,
    StatisticsDate = STATS_DATE(ind.object_id, ind.index_id)  -- when statistics were last updated
FROM sys.indexes ind
ORDER BY StatisticsDate DESC;

Statistics should be updated as part of a weekly maintenance plan.
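
A minimal sketch of that maintenance step using the built-in commands (dbo.Orders is a hypothetical table):

-- refresh statistics for every table in the current database
EXEC sp_updatestats;

-- or target a single table
UPDATE STATISTICS dbo.Orders;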

Andomar
A: 

For a transactionally consistent view without read locks, I recommend enabling snapshot isolation in SQL Server.

This is slightly different from NOLOCK in that the results of a read always reflect a version of committed data, rather than possibly uncommitted data. This provides the same locking concurrency as NOLOCK (no "read" locks) with cleaner results.
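
A minimal sketch of enabling and using it (YourDb and the query are placeholders):

ALTER DATABASE YourDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- then, per session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT balance FROM myaccount;  -- reads the last committed version, takes no read locks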

One should always keep in mind that even with transactional consistency, the data you go on to display or use can be wrong or outdated by the time it is displayed anyway. I've seen too many people assume that if they use the data fast enough, or use it within a query/transaction, it's OK. This is absurd. In my opinion, repeatable-read consistency levels should never have been implemented in the first place, as they just encourage bad behavior. They do not exist in Oracle.

Personally, I'm fond of disabling locking for certain non-critical data views and reports, as it puts less load on the system and the small probability of slightly inaccurate results is not an issue.

Taking advantage of repeatable-read consistency levels, and committing sins such as holding transactions open for user input, might be a little easier on the developer during initial development, but will almost always lead to major road "blocks" to any hope of reasonably scaling your application.

My view is that the best approach is always to "double check" the conditions that must still be true in order to apply an update to any data.

Bad:

UPDATE myaccount SET balance = 2000

Better:

UPDATE myaccount SET balance = balance + 2000

Better still:

UPDATE myaccount SET balance = 2000 WHERE balance = 0 AND accountstatus = 1

Finally, the application must check the row count to make sure the expected number of rows was actually updated before presenting success feedback to the user.
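
A minimal sketch of that check in T-SQL, building on the account example above (how the failure is surfaced is up to the application):

UPDATE myaccount
SET balance = 2000
WHERE balance = 0 AND accountstatus = 1;

IF @@ROWCOUNT <> 1
    -- the account was not in the expected state; do not report success
    RAISERROR('Account was not in the expected state; update not applied.', 16, 1);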

Einstein
A: 

Use NOLOCK as a last resort. Most deadlock problems can be fixed by tuning the queries and/or the indexes. I think I've seen one deadlock in the last five years that couldn't be fixed by tuning one of the two.

Also note that NOLOCK is only honoured on SELECT statements. Data modifications always take locks; that behaviour cannot be changed. So if you've got a writer/writer deadlock (quite common), NOLOCK won't help at all.

Also be aware that NOLOCK, in addition to returning dirty data, can result in duplicate rows (rows read twice from the underlying table) and missing rows (rows in the underlying table that weren't read at all).

NOLOCK essentially tells SQL Server: 'I don't mind if my results are slightly inaccurate.'

Snapshot isolation is an option. Just make sure you test carefully first, as the increased load on TempDB can be quite severe, depending on how frequent and how long-running your transactions are. Also note that while you won't see deadlocks under snapshot isolation, you can get update conflicts. Again, test and make sure your apps work properly and can handle any errors they get.
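
A minimal sketch of handling such an update conflict (the table is hypothetical); 3960 is the error number SQL Server raises when a snapshot transaction loses a write/write race, and other errors would need their own handling:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRY
    BEGIN TRAN;
    UPDATE myaccount SET balance = balance + 2000 WHERE id = 42;
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    IF ERROR_NUMBER() = 3960  -- snapshot update conflict
        PRINT 'Row was changed by another transaction; retry the work.';
END CATCH;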

GilaMonster