We are trying to set up a transactional table that only ever has new records inserted into it on a regular basis.

This simple table requires us to continuously add new records over time. The volume of transactions into this table is expected to be quite high, and there will also be periodic batch imports of transactions (>1000) that may take several seconds to complete.

From this data we then run a set of SELECT statements, grouping by different columns, to return the required values.

From our initial testing we have found a bottleneck: SQL Server blocks our SELECTs while a transaction of INSERTs is in progress.

Below is a simple example that can be run to illustrate the problem.

-- Simple DB Table

create table LOCK_TEST (
    LOCK_TEST_ID int identity,
    AMOUNT int
);

-- Run this in 1 query window

begin tran
    insert into LOCK_TEST (AMOUNT) values (1);
    WAITFOR DELAY '00:00:15'; -- 15 second delay
    insert into LOCK_TEST (AMOUNT) values (1);
commit

-- In Query 2 run this in parallel

select SUM(AMOUNT)
from LOCK_TEST;

I would expect Query 2 to return straight away, showing 0 until Query 1 completes and then 2. We never want to see 1 returned from the second query.

The answers we have looked at relate to WITH (NOLOCK) on the SELECT statement. But this violates the transactional boundaries, and since the returned information may be financial in nature we don't wish to see any uncommitted details in our queries.

My problem seems to be on the INSERT side...
Why does the INSERT block the SELECT statement even though it's not modifying any existing data?
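
(For reference, while the two windows are running, the locks involved can be inspected with something like this against sys.dm_tran_locks in SQL Server 2005+; it shows the exclusive (X) locks held by the INSERT session and the shared lock request the SELECT session is waiting on.)

-- Sketch: run in a third query window while the other two are executing.
-- The INSERT session holds X locks on the newly inserted rows (plus intent
-- locks on the page/table); the SELECT session shows a shared-lock request
-- with request_status = 'WAIT'.
select request_session_id, resource_type, request_mode, request_status
from sys.dm_tran_locks
where resource_database_id = DB_ID();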

Bonus points question: Is this a "feature" of SQL Server, or would we find this on other Database flavours also?

UPDATE: I have now had time to find a local Oracle database and run the same simple test. This test passes as I would expect.

I.e. I can run the query as often as I want, and it returns NULL until the first transaction commits, then returns 2.

Is there a way to make SQL Server work like this, or do we need to move to Oracle?

+2  A: 

This is completely standard behaviour in SQL Server and Sybase (at least).

Data changes (INSERT, UPDATE, DELETE) require exclusive locks. This causes readers to be blocked:

With an exclusive (X) lock, no other transactions can modify data; read operations can take place only with the use of the NOLOCK hint or read uncommitted isolation level.

SQL Server 2005 and above introduced snapshot isolation (aka row versioning); see MS BOL: Understanding Row Versioning-Based Isolation Levels. This means a reader sees the latest committed data, but if you then want to write back you may find the data is wrong because it was changed by the blocking transaction.
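
As a rough sketch (assuming ALLOW_SNAPSHOT_ISOLATION has already been enabled on the database, as shown in the answer below), a reading session can opt in like this:

-- Sketch: the reader opts into snapshot isolation for its session and reads
-- the last committed row versions instead of waiting on the X locks held by
-- the open INSERT transaction.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

select SUM(AMOUNT)
from LOCK_TEST;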

Now, this is why best practice is to keep transactions short: for example, only do what you need to between BEGIN and COMMIT, don't send emails from triggers, etc.

Edit:

Locking like this happens all the time in SQL Server. However, locks held for too long become blocking that reduces performance. "Too long" is subjective, obviously.

gbn
The transactions are short, but the volume of them causes blocking that shouldn't be required at all. See the comment on the question about Oracle.
stevemac
Move to Oracle if you want. But if moving to Oracle is that simple for you, then I suspect your SQL is not optimal, you're round-tripping from client to server, or something else is causing too much blocking.
gbn
How can such a simple example not be optimal? Even the simple example provided still blocks. Can you suggest how I could change it to make it optimal? Please note that there will NEVER be updates to this data, only new inserts.
stevemac
Also, you can use snapshot isolation in SQL Server if you want. This mimics Oracle's behaviour.
gbn
Thanks, I will look into it. The concept of a "snapshot" in a database scares me a bit, but I'll see what it has to offer.
stevemac
Your SUM needs to scan the whole table. It can't, because of the block. In a larger example, you'd probably filter to get a SUM that doesn't require reading the locked rows (see the sketch below). That's how we do it over a table that has around 5 million rows added per day.
gbn
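
A sketch of that filtering idea, assuming the real table also carried an indexed date column (a hypothetical TRAN_DATE, which the simplified LOCK_TEST example above does not have):

-- Sketch only: TRAN_DATE is a hypothetical indexed column, not part of the
-- simplified LOCK_TEST example. Summing a closed date range lets the index
-- seek avoid the newly inserted (and still X-locked) rows.
declare @cutoff datetime;
set @cutoff = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0); -- midnight today

select SUM(AMOUNT)
from LOCK_TEST
where TRAN_DATE < @cutoff;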
+1  A: 

This exists in MySQL too. Here's the logic behind it: if you perform a SELECT on some data, you expect it to operate on an up-to-date dataset. That's why an INSERT results in a table-level lock (unless the engine is able to perform row-level locking, as InnoDB can). There are ways to disable this behaviour so that you can do "dirty reads", but they are all software- (or even database-engine-) specific.
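
In SQL Server, for instance, a session can opt into dirty reads with something like the sketch below, although (as the question notes) that means seeing uncommitted rows:

-- Sketch: READ UNCOMMITTED is the isolation-level equivalent of WITH (NOLOCK).
-- The SELECT no longer blocks, but mid-transaction it would return 1, which
-- is exactly what the question wants to avoid.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

select SUM(AMOUNT)
from LOCK_TEST;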

shylent
I agree, but I also want it to acknowledge transactions. From my point of view the new records (those in the transaction) don't exist, and possibly never will. Why the database blocks on data that (from my point of view) doesn't exist seems wrong. I understand that if I was updating data in one place and SELECTing it from elsewhere I might want to wait for the update, but I want control of that, especially with INSERTs. It seems that Oracle does as I expect; I'm still trying to work out if SQL Server can.
stevemac
+3  A: 

This locking behaviour is a feature of SQL Server. With 2005 and above, you can use row-level versioning (which is what Oracle uses by default) to achieve the same result and not block your SELECTs. This puts extra strain on tempdb, because tempdb maintains the row versions, so make sure you accommodate for that. To make SQL Server behave the way you want it to, run this:

ALTER DATABASE MyDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON;

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON;
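
With READ_COMMITTED_SNAPSHOT on (note that the ALTER DATABASE statement generally needs the database free of other active connections before it can complete), the original two-window test should behave the way you expect: Query 2 returns straight away with the last committed data, without any change to the SELECT itself.

-- Query window 2, unchanged, after READ_COMMITTED_SNAPSHOT has been enabled:
-- it no longer waits on the open INSERT transaction and returns the last
-- committed values (NULL before the first commit, then 2).
select SUM(AMOUNT)
from LOCK_TEST;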
Nick Kavadias