I have a table in a SQL Server 2005 database that is used a lot. It holds our on-hand product availability information. We get updates every hour from our warehouse, and for the past few years we've been running a routine that truncates the table and reloads the information. This only takes a few seconds and has not been a problem, until now. We have a lot more people using our systems that query this information now, and as a result we're seeing a lot of timeouts due to blocking processes.

... so ...

We researched our options and have come up with an idea to mitigate the problem.

  1. We would have two tables. Table A (active) and table B (inactive).
  2. We would create a view that points to the active table (table A).
  3. Everything that needs this table's information (4 objects) would now go through the view.
  4. The hourly routine would truncate the inactive table, update it with the latest information, then update the view to point at the inactive table, making it the active one.
  5. This routine would determine which table is active and basically switch the view between them (a rough sketch of the switch is below).
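
Roughly, the switch at the end of the hourly routine would look something like this (all object names here are made up for illustration; the routine would have just reloaded table B):

-- ALTER VIEW has to be the only statement in its batch, hence the EXEC.
EXEC ('ALTER VIEW dbo.vProductAvailability AS
       SELECT ProductId, QuantityOnHand FROM dbo.ProductAvailability_B');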

What's wrong with this? Will switching the view mid-query cause problems? Can this work?

Thank you for your expertise.

Extra Information

  • The routine is an SSIS package that performs many steps and eventually truncates/updates the table in question.

  • The blocking processes are two other stored procedures that query this table.

+2  A: 

I think this is going about it the wrong way - updating a table has to lock it, although you can limit that locking to per page or even per row.

I'd look at not truncating the table and refilling it. That's always going to interfere with users trying to read it.

If you did update rather than replace the table you could control this the other way - the reading users shouldn't block the table and may be able to get away with optimistic reads.

Try adding the WITH (NOLOCK) hint to the SQL statements that read from the view. You should be able to support very large numbers of readers even while the table is being regularly updated.
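
For example (the view and column names are placeholders for whatever yours are called):

SELECT ProductId, QuantityOnHand
FROM dbo.vProductAvailability WITH (NOLOCK);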

Keith
It's my understanding that SQL hints, especially NOLOCK, are bad for many reasons. http://tinyurl.com/qwloxh
Ryan Montgomery
They're nowhere near as 'bad' as creating copies of the data and rebuilding views in order to avoid locking. The best way to set the locking is actually in the transaction properties, but that isn't possible here because you're using truncate. I also disagree with the points made in that article - sometimes SQL hints are the best way to deal with deadlocks, but you have to understand what you're doing with them.
Keith
Oh, and SO does it: http://www.codinghorror.com/blog/archives/001166.html
Keith
NOLOCK is perfect for some situations, though. You just have to be aware of what it is doing.
David
nolock in this scenario would mean that sometimes (whenever the SSIS package runs) things would go completely nuts: queries would return 0 results (right after the truncate) or return the wrong data (partway through the big transaction).
Sam Saffron
'completely nuts'? Is that a technical term? ;-) Yeah - nolock would result in dirty reads, but guaranteed no locking. The only way to guarantee a clean read is to lock the table, which is what's causing the problem here. For most applications the occasional dirty read really isn't an issue, so nolock can be a really useful tool. You wouldn't use it in a financial or data-critical application, but then you should never use truncate or have un-transacted operations in such an application either.
Keith
+1  A: 

Why not use transactions to update the information rather than a truncate operation?

Truncate is non-logged, so it cannot be done in a transaction.

If your operation is done in a transaction then existing users will not be affected.
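
For example, roughly (Products_Staging here is just a placeholder for wherever the fresh warehouse data lands):

BEGIN TRAN;
    -- A logged delete instead of truncate, so it participates in the transaction.
    DELETE FROM Products;

    INSERT INTO Products (ProductId, QuantityOnHand)
    SELECT ProductId, QuantityOnHand
    FROM Products_Staging;
COMMIT TRAN;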

How this is done would depend on things like the size of the table and how radically the data changes. If you give more detail perhaps I could advise further.

Simon
The routine is an SSIS package, and it appears the whole thing is running in a transaction, which is what causes the blocking.
Ryan Montgomery
+2  A: 

Personally, if you are always going to be introducing down time to run a batch process against the table, I think you should manage the user experience at the business/data access layer. Introduce a table management object that monitors connections to that table and controls the batch processing.

When new batch data is ready, the management object stops all new query requests (maybe even queueing?), allows the existing queries to complete, runs the batch, then reopens the table for queries. The management object can raise an event (BatchProcessingEvent) that the UI layer can interpret to let people know that the table is currently unavailable.

My $0.02,

Nate

Nathan Southerland
We've considered this and we really like this option, but it requires a lot more work to get going, and time is a luxury we don't have, unfortunately. But I agree that handling this independently of the server batch process is optimal.
Ryan Montgomery
A: 

We do this on our high-usage systems and haven't had any problems. However, as with all things database, the only way to be sure it would help would be to make the change in dev and then load test it. Not knowing what else your SSIS package does, it may still cause blocks.

HLGEM
+1  A: 

One possible solution would be to minimize the time needed to update the table.

I would first create a staging table to receive the data from the warehouse.

If you have to do inserts, updates and deletes in the final table, let's suppose the final table looks like this:

CREATE TABLE Products (
    ProductId       int,
    QuantityOnHand  int
);

And you need to update QuantityOnHand from the warehouse.

First create a staging table:

CREATE TABLE Products_WareHouse (
    ProductId       int,
    QuantityOnHand  int
);

And then create an "Actions" table:

CREATE TABLE Products_Actions (
    ProductId       int,
    QuantityOnHand  int,
    Action          char(1)
);

The update process should then be something like this:

1. Truncate table Products_WareHouse

2. Truncate table Products_Actions

3. Fill the Products_WareHouse table with the data from the warehouse, for example:
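
-- One possible way, if the feed arrives as a flat file (the path and format
-- are placeholders; an SSIS data flow into the staging table works just as well):
BULK INSERT Products_WareHouse
FROM 'C:\feeds\onhand.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);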

4. Fill the Products_Actions table with this:

Inserts:

INSERT INTO Products_Actions (ProductId, QuantityOnHand, Action)
SELECT SRC.ProductId, SRC.QuantityOnHand, 'I' AS Action
FROM Products_WareHouse AS SRC
LEFT OUTER JOIN Products AS DEST ON SRC.ProductId = DEST.ProductId
WHERE DEST.ProductId IS NULL;

Deletes:

INSERT INTO Products_Actions (ProductId, QuantityOnHand, Action)
SELECT DEST.ProductId, DEST.QuantityOnHand, 'D' AS Action
FROM Products_WareHouse AS SRC
RIGHT OUTER JOIN Products AS DEST ON SRC.ProductId = DEST.ProductId
WHERE SRC.ProductId IS NULL;

Updates:

INSERT INTO Products_Actions (ProductId, QuantityOnHand, Action)
SELECT SRC.ProductId, SRC.QuantityOnHand, 'U' AS Action
FROM Products_WareHouse AS SRC
INNER JOIN Products AS DEST ON SRC.ProductId = DEST.ProductId
                           AND SRC.QuantityOnHand <> DEST.QuantityOnHand;

Up to this point you haven't locked the final table.

5. In a transaction, update the final table:

BEGIN TRAN;

DELETE Products
FROM Products
INNER JOIN Products_Actions ON Products.ProductId = Products_Actions.ProductId
WHERE Products_Actions.Action = 'D';

INSERT INTO Products (ProductId, QuantityOnHand)
SELECT ProductId, QuantityOnHand
FROM Products_Actions
WHERE Action = 'I';

UPDATE Products
SET QuantityOnHand = SRC.QuantityOnHand
FROM Products
INNER JOIN Products_Actions AS SRC ON Products.ProductId = SRC.ProductId
WHERE SRC.Action = 'U';

COMMIT TRAN;

With the process above, you reduce the number of records to be updated to the minimum necessary, and so minimize the time the final table is locked while updating.

You could even skip the transaction in the final step, so that the table is released between commands.

Burnsys
A: 

If you have the Enterprise Edition of SQL Server at your disposal then may I suggest that you use SQL Server Partitioning technology.

You could have your currently required data reside within the 'Live' partition and the updated version of the data in the 'Secondary' partition (which is not available for querying but rather for administering data).

Once the data has been imported into the 'Secondary' partition you can instantly SWITCH the 'Live' partition OUT and the 'Secondary' partition IN, thereby incurring zero downtime and no blocking.

Once you have made the switch, you can go about truncating the no-longer-needed data without adversely affecting users of the newly live data (previously the Secondary partition).

Each time you need to do an import job, you simply repeat/reverse the process.
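
A rough sketch of the switch itself (every name here is invented; the staging and archive tables must match the live table's structure and filegroup, and the receiving partitions must be empty):

-- Move the current live data out, then the freshly loaded data in.
-- Both statements are metadata-only operations, so they complete almost instantly.
ALTER TABLE dbo.Products SWITCH PARTITION 1 TO dbo.Products_Old PARTITION 1;
ALTER TABLE dbo.Products_Staging SWITCH PARTITION 1 TO dbo.Products PARTITION 1;

-- The superseded data can then be truncated off the critical path.
TRUNCATE TABLE dbo.Products_Old;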

To learn more about SQL Server Partitioning see:

http://msdn.microsoft.com/en-us/library/ms345146(SQL.90).aspx

Or you can just ask me :-)

EDIT:

On a side note, in order to address any blocking issues, you could use SQL Server Row Versioning technology.

http://msdn.microsoft.com/en-us/library/ms345124(SQL.90).aspx

John Sansom
This is the idea of what we wanted to accomplish with a view, but we only have the Standard Edition of SQL Server.
Ryan Montgomery
+4  A: 

Have you considered using snapshot isolation? It would allow you to begin a big fat transaction for your SSIS stuff while everyone else can still read from the table.

This solution seems much cleaner than switching the tables.
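
Something along these lines at the database level ("YourDb" is a placeholder):

-- Lets readers opt in with SET TRANSACTION ISOLATION LEVEL SNAPSHOT.
ALTER DATABASE YourDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Or make row versioning the default READ COMMITTED behaviour instead;
-- this one needs no other active connections to take effect.
ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON;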

Sam Saffron
+1 on this. It's exactly what you should be doing. Ideally, you'd get updates that didn't require dumping everything, but since you don't, the trick is to start a transaction, delete (not truncate) everything, then load everything. The instant that you run commit, the database will make everything change over with no gap.
Autocracy
Why not truncate? Wouldn't it be slower to delete since every delete is logged?
Ryan Montgomery
I think that is the point. By using snapshot isolation, it's the commit that causes users to see the new data. Truncate isn't logged and I don't know how it would affect the transaction, but probably not how you would want. In fact, I bet the truncate would fail if the table had a pending transaction.
Zack
We are testing what setting Read_Committed_Snapshot does to our situation. So far so good.
Ryan Montgomery
A: 

If the table is not very large, you could cache the data in your application for a short time. It may not eliminate blocking altogether, but it would reduce the chances that the table would be queried when an update occurs.

TGnat
A: 

Hi,

Perhaps it would make sense to do some analysis of the processes which are blocking, since they seem to be the part of your landscape which has changed. It only takes one poorly written query to create the blocks which you are seeing. Barring a poorly written query, maybe the table needs one or more covering indexes to speed up those queries and get you back on your way without having to re-engineer your already working code.
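
If it does turn out to be indexing, something like this (purely illustrative, using the column names from the examples above) can cover the readers' queries so they never touch the base rows:

CREATE NONCLUSTERED INDEX IX_Products_ProductId
    ON dbo.Products (ProductId)
    INCLUDE (QuantityOnHand);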

Hope this helps,

Bill

Bill Mueller