I have a Kimball-style DW (facts and dimensions in star models - no late-arriving facts rows or columns, no columns changing in dimensions except expiry as part of Type 2 slowly changing dimensions) with heavy daily processing to insert and update rows (on new dates) and monthly and daily reporting processes. The fact tables are partitioned by the dates for easy rolloff of old data.

I understand that WITH(NOLOCK) can cause uncommitted data to be read; however, I also do not wish to take any locks which would cause the ETL processes to fail or block.

In all cases, when we are reading from the DW, we are reading from fact tables for a date which will not change (the fact tables are partitioned by date) and dimension tables which will not have attributes changing for the facts they are linked to.
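To make the scenario concrete, a closed-date query with the hint on every table referenced might look like this (table, column, and date values are illustrative, not from my actual schema):

```sql
SELECT d.Region,
       SUM(f.SalesAmount) AS TotalSales
FROM   dbo.FactSales   AS f WITH (NOLOCK)
JOIN   dbo.DimCustomer AS d WITH (NOLOCK)
       ON d.CustomerKey = f.CustomerKey
WHERE  f.DateKey = 20091031   -- a date/partition that is fully loaded and will not change
GROUP BY d.Region;
```

Note the hint must be repeated on every table reference; it is not inherited from the first one.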

So - are there any disadvantages? - perhaps in the execution plans or in the operation of such SELECT-only queries running in parallel off the same tables.

+1  A: 

As long as it's all no-update data there's no harm, but I'd be surprised if there's much benefit either. I'd say it's worth a try. The worst that will happen is that you'll get incomplete and/or inconsistent data if you are in the middle of a batch insert, but you can decide if that invalidates anything useful.

le dorfier
The fact rows we are reading will not change, and the dimension rows will always be valid; they may be expired, with a new dimension row created for new facts.
Cade Roux
Seems entirely straightforward to me. I have only two questions. 1. Is there a problem with the way it's running without making this change (i.e. is this possibly a premature optimization)? 2. These are all read-only queries and you're relaxing their isolation levels. What bad thing (other than the obvious crappiness of the results, which you are evidently mitigating by your emphasis on appends and fact versioning) are you imagining?
le dorfier
I'm not in control of the ETL, but I am responsible for all the reporting. I am not given access to sp_who, so I need to proactively make sure all my (significant) processing does not interfere with daily and monthly loads before the DBAs complain I'm blocking them.
Cade Roux
Then I think the simple answer to your specific question is that using WITH(NOLOCK) will decrease, and likely eliminate, the likelihood that you will create any locks which would cause the ETL processes to fail or block; and there are no disadvantages created by this strategy. You have a bunch of suggestions for more exotic ways to accomplish approximately the same thing, but they generally involve global server setting alterations that you probably don't have access to and your DBAs will be suspicious about.
le dorfier
+1  A: 

Yes. Your SQL will be far less readable, and you will inevitably miss some NOLOCK hints, because the hint has to be repeated on every table reference in every SELECT.

You can get the same thing by setting the isolation level

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

In the end you get roughly a 10% performance boost (sorry, I'm too lazy to look up the article for it, but it's out there).

I'd say a 10% gain isn't worth reducing readability.

MatthewMartin
+1 for setting the isolation level, but WITH(NOLOCK) is better than nothing. It's not so much a 10% performance question as a will-it-run-or-not question.
jms
I will consider it. There are other operations on the connection (i.e. my process configuration and status data, not the fact/dim star-modelled data) which I do not necessarily want to have behave that way. At this point I have the WITH(NOLOCK) centralized into the views I use which wrap and flatten the star models in the DW database.
Cade Roux
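The centralized-view approach Cade describes might look like the following sketch (view and table names are hypothetical); every table reference inside the view carries the hint, so report queries written against the view inherit the behavior without repeating it:

```sql
-- Illustrative view flattening a star model; the NOLOCK hints live in one
-- place, so reporting queries against the view need no hints of their own.
CREATE VIEW dbo.vFactSalesFlat
AS
SELECT f.DateKey,
       f.SalesAmount,
       c.Region,
       p.ProductName
FROM   dbo.FactSales   AS f WITH (NOLOCK)
JOIN   dbo.DimCustomer AS c WITH (NOLOCK) ON c.CustomerKey = f.CustomerKey
JOIN   dbo.DimProduct  AS p WITH (NOLOCK) ON p.ProductKey  = f.ProductKey;
```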
A: 

If making the whole database read-only is possible, then this is a better option. You'll get read-uncommitted performance without having to modify all your code.

ALTER DATABASE AdventureWorks SET READ_ONLY;
Nick Kavadias
The DW database is already read-only to my user, but has to be writable to the ETL processes loading new data into the facts (and dimensions as necessary). My database holds my process procedures and configuration.
Cade Roux
Maybe a read-only file group?
Nick Kavadias
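Since the fact tables are partitioned by date and old partitions never change, the read-only filegroup idea could be sketched like this (filegroup and database names are hypothetical; note that marking a filegroup read-only requires exclusive access to the database, so it would have to be coordinated with the ETL window):

```sql
-- Assumes closed-off historical partitions live on a filegroup named ARCHIVE_FG.
-- Once the filegroup is read-only, SQL Server does not take locks on reads
-- against it, while the ETL keeps writing to the current read-write filegroup.
ALTER DATABASE DW MODIFY FILEGROUP ARCHIVE_FG READ_ONLY;
```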
+2  A: 

This is what you probably need:

ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;

ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;

Then go ahead and use

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

in your queries. According to BOL:

The behavior of READ COMMITTED depends on the setting of the READ_COMMITTED_SNAPSHOT database option:

If READ_COMMITTED_SNAPSHOT is set to OFF (the default), the Database Engine uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.

If READ_COMMITTED_SNAPSHOT is set to ON, the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.

Hope this helps.

Raj
I will consider it. There are other operations on the connection (i.e. my process configuration and status data, not the fact/dim star-modelled data) which I do not necessarily want to have behave that way.
Cade Roux
A: 

Have you considered creating a DATABASE SNAPSHOT of your DW and run your reports off it?

Remus Rusanu
No, that's not really possible, since we're talking about several TBs of data. The DW is designed for this purpose, which is why the fact tables are partitioned by date.
Cade Roux
But a database snapshot is a sparse file with copy-on-write semantics. All you need is the space on disk to be reserved, actual I/O operations will occur only when a write occurs on the original database.
Remus Rusanu
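Creating such a snapshot is a one-statement operation; since the snapshot file is sparse and copy-on-write, reserving space for a multi-TB source does not consume that much actual disk. A sketch (database and file names are hypothetical; the logical NAME must match the source database's data file):

```sql
-- The .ss file is a sparse file: it stores only pages that change in the
-- source database after the snapshot is created, so reads of untouched
-- pages are served from the original data files.
CREATE DATABASE DW_Reporting_Snapshot
ON ( NAME = DW_Data,                          -- logical name of the source data file
     FILENAME = 'D:\Snapshots\DW_Data.ss' )
AS SNAPSHOT OF DW;
```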
A: 

NOLOCK performs a 'dirty read' (incidentally, READ UNCOMMITTED does the same thing as NOLOCK). If the database is being updated as you read, there is a danger that you will get inconsistent data back. The only options are to either accept locking, and hence blocking, or to pick one of the two new snapshot-based isolation levels offered in SQL Server 2005 onwards, discussed here.

Ben Breen
There are no inserts or updates to the data possible. The only changes will be to future dates which we don't read until the processing is complete.
Cade Roux
Then NOLOCK is the correct solution.
Ben Breen