how do i do very fast inserts to SQL Server 2008

+2 Q:

I have a project that involves recording data from a device directly into a SQL table.

I do very little processing in code before writing to SQL Server (2008 Express, by the way).

Typically I use the SqlHelper class's ExecuteNonQuery method and pass in a stored proc name and a list of the parameters that the SP expects.

This is very convenient, but I need a much faster way of doing this.

Thanks.

+1 A:

Bulk insert would be the fastest, since it can be minimally logged.

.NET also has the SqlBulkCopy Class

SQLMenace 2010-05-18 23:44:13

I have to log each output one at a time, i cannot collect several outputs and log them in bulk

CharlesO 2010-05-19 00:13:10

+1 A:

This is typically done by way of a BULK INSERT. Basically, you prepare a file and then issue the BULK INSERT statement, and SQL Server copies all the data from the file to the table with the fastest method possible.

It does have some restrictions (for example, there's no way to do "update or insert" type of behaviour if you have possibly-existing rows to update), but if you can get around those, then you're unlikely to find anything much faster.

Dean Harding 2010-05-18 23:44:56

i have to log each device output individually, that excludes the bulk options

CharlesO 2010-05-18 23:50:45

What do you mean by "log each device output individually"?

Dean Harding 2010-05-18 23:54:13

what I mean is I have to log each output one at a time, i cannot collect several outputs and log them in bulk

CharlesO 2010-05-19 00:12:48

If you mean from .NET then use SqlBulkCopy

Mitch Wheat 2010-05-18 23:45:49

i have to log each device output individually, that excludes the bulk options

CharlesO 2010-05-18 23:51:02

@CharlesO: I have no idea what that means. Bulk has no relation to many devices!

Mitch Wheat 2010-05-18 23:55:08

what I mean is I have to log each output one at a time, i cannot collect several outputs and log them in bulk

CharlesO 2010-05-19 00:12:31

+1 A:

Things that can slow inserts include indexes and reads or updates (locks) on the same table. You can speed up situations like yours by avoiding both and inserting individual transactions to a separate holding table with no indexes or other activity. Then batch the holding table to the main table a little less frequently.

Joel Coehoorn 2010-05-18 23:55:06
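The holding-table idea above can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for SQL Server so the example runs anywhere, and the table names (holding, main_log), the columns, the id column used as a cutoff, and the flush_holding helper are all hypothetical, not something from the answer.

```python
import sqlite3


def flush_holding(conn):
    """Move everything currently in the holding table to the main table."""
    cur = conn.cursor()
    # Fix an upper bound first so rows arriving later wait for the next flush.
    (last_id,) = cur.execute("SELECT MAX(id) FROM holding").fetchone()
    if last_id is None:
        return 0  # nothing to move
    cur.execute(
        "INSERT INTO main_log (device_id, value) "
        "SELECT device_id, value FROM holding WHERE id <= ? ORDER BY id",
        (last_id,),
    )
    cur.execute("DELETE FROM holding WHERE id <= ?", (last_id,))
    moved = cur.rowcount  # rows removed from the holding table
    conn.commit()  # the copy and the delete are committed together
    return moved


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holding (id INTEGER PRIMARY KEY, device_id INTEGER, value REAL)"
)
conn.execute("CREATE TABLE main_log (device_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO holding (device_id, value) VALUES (?, ?)",
    [(1, 0.5), (1, 0.7), (2, 1.2)],
)
conn.commit()

moved = flush_holding(conn)
left = conn.execute("SELECT COUNT(*) FROM holding").fetchone()[0]
in_main = conn.execute("SELECT COUNT(*) FROM main_log").fetchone()[0]
print(moved, left, in_main)
```

Doing the copy and the delete in one transaction means a failed flush leaves the holding table untouched, so the next run can simply retry.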

indexes can also help to speed INSERTS! Like anything with a DB, it depends...

Mitch Wheat 2010-05-19 00:07:04

@Mitch: This is news to me, can you give us an example?

Aaronaught 2010-05-19 00:19:41

@Joel: Your suggestion makes sense. Basically I hold open an ADO.NET connection and keep using SqlCommand objects to insert records as fast as the device produces them. However:
1) Will raw SQL inserts be a better option than using a stored proc in this case?
2) Will this holding table need to have a clustered index, like an auto-incrementing ID column?
3) Say I run a SQL Agent script every 2 seconds to batch the holding table to the main table; won't the deletes on the holding table after copying each batch further degrade performance?
Thanks

CharlesO 2010-05-19 00:29:13

@Mitch: This is news to me as well...

CharlesO 2010-05-19 00:31:20

@Aaronaught - by helping other queries run faster and therefore reducing locking on the table.

Joel Coehoorn 2010-05-19 00:42:47

Indexes might speed up the checking of foreign key constraints.

meriton 2010-05-19 01:52:49

It can only really go as fast as your SP will run. Ensure that the table(s) are properly indexed and if you have a clustered index, ensure that it has a narrow, unique, increasing key. Ensure that the remaining indexes and constraints (if any) do not have a lot of overhead.

You shouldn't see much overhead in the ADO.NET layer (I wouldn't necessarily use any other .NET library above SqlCommand). You may be able to use ADO.NET async methods in order to queue several calls to the stored proc without blocking a single thread in your application (this potentially could free up more throughput than anything else - just like having multiple machines inserting into the database).

Other than that, you really need to tell us more about your requirements.

Cade Roux 2010-05-19 00:01:45

@Cade - solid suggestion, stick to pure ADO.net only, and possibly hold my connection open as long as possible. Building the parameters for the SP and then calling the SP and passing parameters, it seems tedious for a simple insert operation. Would you recommend sending insert statements directly from ADO.net?

CharlesO 2010-05-19 00:37:41

@CharlesO - I would use SqlCommand, add the parameters and call ExecuteNonQuery (I would not send the 'INSERT INTO blah VALUES (blah)' literal string) - and I would definitely think about using the async version: BeginExecuteNonQuery (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.beginexecutenonquery.aspx) with some sort of governor on how many inserts are active and a queue internal to the program. You can then see if it can handle your expected workload or if you need more extensive storing and forwarding.

Cade Roux 2010-05-19 02:20:22
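One way to picture the "internal queue with a governor" idea is a bounded producer/consumer queue. The sketch below is not the BeginExecuteNonQuery approach itself: it swaps in a single background writer thread, uses Python's sqlite3 as a runnable stand-in for SQL Server, and the table, the writer function, and the queue sizes are all hypothetical choices for illustration.

```python
import queue
import sqlite3
import threading

STOP = object()  # sentinel telling the writer thread to finish


def writer(q, results, batch_size=500):
    # The connection is created and used only on the writer thread.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device_id INTEGER, value REAL)")
    pending = 0
    while True:
        item = q.get()
        if item is STOP:
            break
        conn.execute(
            "INSERT INTO readings (device_id, value) VALUES (?, ?)", item
        )
        pending += 1
        # Commit when the batch is full or the queue has momentarily drained.
        if pending >= batch_size or q.empty():
            conn.commit()
            pending = 0
    conn.commit()  # flush whatever is left
    results.append(conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
    conn.close()


# The bounded queue acts as the governor: put() blocks once 1000 readings
# are waiting, so the producer cannot outrun the writer without limit.
q = queue.Queue(maxsize=1000)
results = []
t = threading.Thread(target=writer, args=(q, results))
t.start()

for i in range(3000):
    q.put((1, float(i)))  # device loop: hand off the reading and carry on
q.put(STOP)
t.join()

written = results[0]
print(written)
```

The device loop only pays the cost of a queue hand-off per reading, while the writer decides when to commit.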

+7 A:

ExecuteNonQuery with an INSERT statement, or even a stored procedure, will get you into thousands of inserts per second range on Express. 4000-5000/sec are easily achievable, I know this for a fact.

What usually slows down individual updates is the wait time for the log flush, and you need to account for that. The easiest solution is to simply batch commit, e.g. commit every 1000 inserts, or every second. This will fill up the log pages and will amortize the cost of the log flush wait over all the inserts in a transaction.
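A minimal sketch of the batch-commit pattern described above, with loudly stated assumptions: Python's sqlite3 stands in for SQL Server so the example is runnable as-is, and the readings table, the log_readings helper, and its max_batch and max_seconds parameters are hypothetical names. With ADO.NET the same shape would be built from a SqlTransaction that is committed and reopened at each batch boundary.

```python
import sqlite3
import time


def log_readings(conn, readings, max_batch=1000, max_seconds=1.0):
    """Insert readings one at a time, committing every max_batch rows
    or every max_seconds, whichever comes first. Returns the commit count."""
    cur = conn.cursor()
    commits = 0
    pending = 0
    last_commit = time.monotonic()
    for device_id, value in readings:
        # Each reading is still its own parameterized INSERT statement.
        cur.execute(
            "INSERT INTO readings (device_id, value) VALUES (?, ?)",
            (device_id, value),
        )
        pending += 1
        too_old = time.monotonic() - last_commit >= max_seconds
        if pending >= max_batch or too_old:
            conn.commit()  # one flush covers the whole batch
            commits += 1
            pending = 0
            last_commit = time.monotonic()
    if pending:
        conn.commit()  # flush the final partial batch
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id INTEGER, value REAL)")
conn.commit()

commits = log_readings(conn, ((1, float(i)) for i in range(2500)))
total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(total, commits)
```

The trade-off is that a crash can lose whatever sits in the current uncommitted batch, so max_batch and max_seconds bound how much data is at risk.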

With batch commits you'll probably bottleneck on disk log write performance, which you can do nothing about short of changing the hardware (e.g. putting the log on a RAID 0 stripe).

If you hit earlier bottlenecks (unlikely) then you can look into batching statements, i.e. sending one single T-SQL batch with multiple inserts in it. But this seldom pays off.

Of course, you'll need to reduce the size of your writes to a minimum, meaning reduce the width of your table to the minimally needed columns, eliminate non-clustered indexes, eliminate unneeded constraints. If possible, use a Heap instead of a clustered index, since Heap inserts are significantly faster than clustered index ones.

There is little need to use the fast insert interface (i.e. SqlBulkCopy). Using ordinary INSERTs and ExecuteNonQuery with batch commits, you'll exhaust the drive's sequential write throughput long before you need to deploy bulk insert. Bulk insert is needed on fast SAN-connected machines, and you mention Express, so that's probably not the case here. There is a perception to the contrary out there, but that is simply because people don't realize that bulk insert gives them batch commit, and it's the batch commit that speeds things up, not the bulk insert.

As with any performance test, make sure you eliminate randomness, and preallocate the database and the log, you don't want to hit db or log growth event during test measurements or during production, that is sooo amateurish.

Remus Rusanu 2010-05-19 00:12:39

lol @ "that is sooo amateurish...." @Remus - Thanks man, you totally nailed it. Please clarify on "ExecuteNonQuery". Do you mean the ExecuteNonQuery method on the SqlHelper class in Microsoft.ApplicationBlocks.Data?

CharlesO 2010-05-19 00:47:23

I mean the basic SqlCommand http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.executenonquery.aspx, but I think the Application Framework one is very similar in every aspect. One thing you gotta make absolutely sure of is that Application Framework does *not* enroll your connection into a distributed transaction; that would just sink any performance.

Remus Rusanu 2010-05-19 01:27:04

thanks. i'll use the basic sqlcommand

CharlesO 2010-05-19 12:19:00
