views: 181
answers: 5

I am creating functionality where our WCF services log all changes that are stored through them, and the changes need to be sent to other systems.

After every service call with changes, we store the changes in a table (the changes are serialized). Regularly we have BizTalk pull the changes from the table and delete the ones that were pulled.

This means that during high load the number of inserts and deletes is high, and we struggle with the select timing out because it is blocked by the inserts.

I have tried playing with different isolation levels, but have not found anything that works yet.

We use ADO.NET and SQL Server 2005 for this.

What is the best practice for implementing a data table with many inserts, deletes and reads when using SQL Server 2005 and ADO.NET?

Edited: The problem with our current solution is that the continuous inserts block all reads from the table, probably because of a clustered index scan that I see no good way to remove at the moment.

A: 

This is something that your DBA should be doing. Are you the DBA? If so, you'll need to run some performance tests to see where the bottleneck is. This article is a useful introduction to performance tuning: http://www.devx.com/getHelpOn/Article/8202/

Jonathan Parker
Not all companies have dedicated DBAs, but I know how to use the profiler, DBCC INPUTBUFFER and other methods to get information about SQL performance. And this is not only a matter for the DB, since it also involves the code that is running the queries.
Atle
I would definitely run Profiler on it and see which queries are slow. Also remember that if one query is faster than another but is run 100x or 1000x more often, then optimizing it by 1.5x will be more useful overall than optimizing the less frequently run query by 2x.
Jonathan Parker
The problem is that the inserts take locks that prevent the select from working. When adding NOLOCK to the select it works, but then I get bad data, and my app cannot tolerate that.
Atle
So you need to run profiler to see what indexes you can put on the table to reduce the table scan (and thus the locking) of the select.
Jonathan Parker
A: 

If your application's semantics allow, consider doing inserts into a separate "auxiliary table" and periodically applying them to the "master table" in a single large transaction. Deletes can be handled similarly: where you would now do a delete, instead insert into a separate auxiliary table a record identifying the master table row to delete, and periodically perform those deletes against the master table in a single big transaction.

Of course, if you do that the master table won't reflect "the very latest state" but "the state as of a few minutes [or whatever time units] ago", which is why I say "if semantics allow". Sometimes it's OK to have many SELECTs rely on "state as of last update" (i.e., what's in the master table) and a few SELECTs that really need to reflect instantly the latest updates can be satisfied (maybe a bit slower) with more complex queries using both the master and auxiliary tables (via UNION, joins or whatever, depending on the details of the semantics you need to implement).
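A rough sketch of how the periodic move could look from ADO.NET, assuming hypothetical ChangeLogStaging and ChangeLog tables with an Id column (all names and the schema are invented for illustration). The staged rows are copied and then removed in one transaction, bounded by the highest Id seen at the start so that rows inserted during the move are left for the next run:

    // Hypothetical sketch of the "auxiliary table" idea: WCF writers insert
    // into ChangeLogStaging, and a background job periodically moves the rows
    // into ChangeLog in one transaction. Table/column names are invented.
    using System;
    using System.Data.SqlClient;

    class ChangeLogMover
    {
        public static void MoveStagedChanges(string connectionString)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (SqlTransaction tx = conn.BeginTransaction())
                {
                    // Remember the highest staged Id so rows that arrive while
                    // we are copying are not deleted before they are moved.
                    SqlCommand max = new SqlCommand(
                        "SELECT ISNULL(MAX(Id), 0) FROM ChangeLogStaging", conn, tx);
                    long maxId = Convert.ToInt64(max.ExecuteScalar());
                    if (maxId == 0) { tx.Commit(); return; }

                    SqlCommand copy = new SqlCommand(
                        "INSERT INTO ChangeLog (Id, Payload, CreatedUtc) " +
                        "SELECT Id, Payload, CreatedUtc FROM ChangeLogStaging " +
                        "WHERE Id <= @maxId", conn, tx);
                    copy.Parameters.AddWithValue("@maxId", maxId);
                    copy.ExecuteNonQuery();

                    SqlCommand clear = new SqlCommand(
                        "DELETE FROM ChangeLogStaging WHERE Id <= @maxId", conn, tx);
                    clear.Parameters.AddWithValue("@maxId", maxId);
                    clear.ExecuteNonQuery();

                    tx.Commit();
                }
            }
        }
    }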

Alex Martelli
The idea of the "auxiliary table" looks good, but I am afraid I would just be moving the problem. The same problem will probably occur when doing the select/insert against the master table, since the inserts block the reads. All queries against the table use only that table: no joins, unions or anything else.
Atle
A: 

You mention that you've tried different isolation levels, but have you tried snapshot isolation?

Snapshot isolation (row versioning) basically keeps a copy of each row as it was before a modification, and any query that attempts to select that row gets the old version until the modifying transaction commits. This means that you'll occasionally end up with slightly older data, but selects won't block while inserts are happening.

The downside of this is that your inserts/updates will take longer, due to the need to take a copy. I'm not sure how much of a penalty it incurs though.
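For what it's worth, a minimal sketch of reading under snapshot isolation from ADO.NET, assuming a hypothetical ChangeLog table (names are invented); note that ALLOW_SNAPSHOT_ISOLATION has to be switched on for the database first (ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON):

    // Hypothetical sketch: the SELECT runs under snapshot isolation, so it
    // sees the last committed row versions instead of blocking on writers.
    // The database must have ALLOW_SNAPSHOT_ISOLATION ON; names are invented.
    using System.Data;
    using System.Data.SqlClient;

    class ChangeReader
    {
        public static void ReadChanges(string connectionString)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (SqlTransaction tx = conn.BeginTransaction(IsolationLevel.Snapshot))
                {
                    using (SqlCommand cmd = new SqlCommand(
                        "SELECT Id, Payload FROM ChangeLog ORDER BY Id", conn, tx))
                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            long id = reader.GetInt64(0);
                            string payload = reader.GetString(1);
                            // ... hand the change over to the next system ...
                        }
                    }
                    tx.Commit();
                }
            }
        }
    }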

Dan Fuller
I have been thinking about this solution. The way the app works means that getting old data is no problem. What worries me is the performance hit. Any idea how much that could be? I have no updates in the table, only inserts, reads and deletes.
Atle
No idea how much the performance hit would be; however, it'd be easy to write a quick program to perform 1,000,000 inserts with and without snapshot enabled and see how it handles. Another possible solution is to use the NOLOCK query hint in your select statement. This causes the select to completely ignore any locks on rows in the database and read them anyway. The main drawback is that as well as getting old data, you can also get incorrect data if a rollback occurs.
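If it helps, a rough sketch of that timing test, assuming a hypothetical ChangeLog(Payload) table (names are invented); run it once with snapshot isolation off and once with it on and compare the elapsed times:

    // Quick-and-dirty insert benchmark; table and column names are invented.
    using System;
    using System.Data.SqlClient;
    using System.Diagnostics;

    class InsertBenchmark
    {
        public static void Run(string connectionString, int rows)
        {
            Stopwatch watch = Stopwatch.StartNew();
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(
                "INSERT INTO ChangeLog (Payload) VALUES (@payload)", conn))
            {
                cmd.Parameters.AddWithValue("@payload", "test change");
                conn.Open();
                for (int i = 0; i < rows; i++)
                {
                    cmd.ExecuteNonQuery();
                }
            }
            watch.Stop();
            Console.WriteLine("{0} inserts took {1}", rows, watch.Elapsed);
        }
    }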
Dan Fuller
+1  A: 

On the DB side:

  • Try using the READPAST locking hint, which skips rows that are locked by other transactions (see the sketch after this list). See MSDN for details.
  • SQL Server sometimes escalates locks from row level to page or table level if it thinks that is more economical. You can force it to keep the locks at row level with the ROWLOCK hint.
  • Drop any unnecessary indexes, because they slow down inserts, updates and deletes and can also cause locking problems on the index data.
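As a rough sketch of the first suggestion, assuming a hypothetical ChangeLog table that the BizTalk job drains (table and column names are invented), the SELECT below skips any rows that are still locked by in-flight inserts instead of waiting on them:

    // Hypothetical sketch: READPAST makes the reader skip locked rows rather
    // than block behind the inserts. Table/column names are invented.
    using System.Collections.Generic;
    using System.Data.SqlClient;

    class ChangeQueueReader
    {
        public static List<long> ReadPendingIds(string connectionString)
        {
            List<long> ids = new List<long>();
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(
                "SELECT Id FROM ChangeLog WITH (READPAST) ORDER BY Id", conn))
            {
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        ids.Add(reader.GetInt64(0));
                    }
                }
            }
            return ids;
        }
    }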

On the C# code side:

  • Use the best practice of 'acquire resources as late as possible and release them as early as possible': open your ADO.NET connection, run your SQL command and close the connection before doing any time-consuming operations on the results. You don't have to worry about connection pooling; it is handled automatically as long as you use the same connection string all the time.
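A minimal sketch of that pattern, with placeholder table and column names: the connection is held only while the query runs, and the slow work happens after it has gone back to the pool.

    // Acquire late, release early: fill a DataTable, dispose the connection,
    // then do the time-consuming processing. Names are placeholders.
    using System.Data;
    using System.Data.SqlClient;

    class ChangeProcessor
    {
        public static void Process(string connectionString)
        {
            DataTable changes = new DataTable();

            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlDataAdapter adapter = new SqlDataAdapter(
                "SELECT Id, Payload FROM ChangeLog", conn))
            {
                // Fill opens and closes the (pooled) connection for us.
                adapter.Fill(changes);
            }

            foreach (DataRow row in changes.Rows)
            {
                // ... time-consuming work on the results happens here,
                // after the connection has been released ...
            }
        }
    }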
Vizu
I will try READPAST, and it sounds like it will work, since the rows I want will always be first and not locked (at least after they are committed). Would READPAST also return uncommitted rows?
Atle
READPAST won't return dirty/uncommitted data. It simply skips the locked rows. This hint is especially suited to 'queue-like' tables like yours.
Vizu
Some first tests indicate that READPAST is just what I needed. Thanks! I will test a little more and report the result.
Atle
A: 

This will probably seem extreme, but have you considered temporarily disabling the index causing that clustered index scan at peak times? You could then rebuild the index at a quiet part of the day.

Articles on disabling indexes

TechNet Guidelines for Disabling Indexes

Disable Indexes
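A hedged sketch of what that could look like, assuming the scan is driven by a hypothetical non-clustered index named IX_ChangeLog_SomeField on a ChangeLog table (the names are invented). Note that only a non-clustered index can safely be disabled this way; disabling the clustered index takes the whole table offline until it is rebuilt.

    // Hypothetical sketch: disable the index before the peak-load window and
    // rebuild it in a quiet period. ALTER INDEX ... DISABLE / REBUILD is
    // available from SQL Server 2005 onwards. Index/table names are invented.
    using System.Data.SqlClient;

    class IndexMaintenance
    {
        public static void DisableForPeakLoad(string connectionString)
        {
            Execute(connectionString,
                "ALTER INDEX IX_ChangeLog_SomeField ON ChangeLog DISABLE");
        }

        public static void RebuildAfterQuietHours(string connectionString)
        {
            Execute(connectionString,
                "ALTER INDEX IX_ChangeLog_SomeField ON ChangeLog REBUILD");
        }

        static void Execute(string connectionString, string sql)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }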

kevchadders
I have a minimum of indexes on the table: only the primary key and one other index that covers a single int field. The reads and inserts go on continuously all the time, so disabling would not help much.
Atle