views:

143

answers:

1

Hi.
I'm looking for a very fast way to insert data into an MS SQL database. The data is a (relatively big) construct of objects with relations. For security reasons I want to use stored procedures instead of direct table access.

Let's say I have a structure like this:

  • Document
    • MetaData
      • User
      • Device
    • Content
      • ContentItem[0]
        • SubItem[0]
        • SubItem[1]
        • SubItem[2]
      • ContentItem[1]
        • ...
      • ContentItem[2]
        • ...

Right now I am thinking of building one big query that does something like this (just pseudo-code):

EXEC @DeviceID = CreateDevice ...;
EXEC @UserID = CreateUser ...;
EXEC @DocID = CreateDocument @DeviceID, @UserID, ...;

EXEC @ItemID = CreateItem @DocID, ...
EXEC CreateSubItem @ItemID, ...
EXEC CreateSubItem @ItemID, ...
EXEC CreateSubItem @ItemID, ...
...

But is this the best solution for performance? If not, what would be better? Split it into more queries? Pass all the data to one big stored procedure to reduce the number of round trips? Any other performance tips?

I also thought of passing multiple items to one stored procedure, but I don't think it's possible to pass a non-static number of items to a stored procedure. Since 'INSERT INTO A VALUES (B,C),(C,D),(E,F)' performs better than three single inserts, I hoped to gain some performance there.
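(For what it's worth, a non-static number of rows *can* be passed to a stored procedure on SQL Server 2008 and later via a table-valued parameter. A minimal sketch, with made-up table, type, and column names:)

```sql
-- Hypothetical table type holding the rows to insert (SQL Server 2008+).
CREATE TYPE dbo.SubItemList AS TABLE (
    ItemID  int            NOT NULL,
    Payload nvarchar(200)  NOT NULL
);
GO

-- The procedure receives any number of rows in a single call.
CREATE PROCEDURE dbo.CreateSubItems
    @SubItems dbo.SubItemList READONLY
AS
BEGIN
    INSERT INTO SubItems (ItemID, Payload)
    SELECT ItemID, Payload FROM @SubItems;
END
GO

-- Caller side: fill a table variable and pass it as one parameter.
DECLARE @rows dbo.SubItemList;
INSERT INTO @rows VALUES (1, N'first'), (1, N'second'), (1, N'third');
EXEC dbo.CreateSubItems @SubItems = @rows;
```

From ADO.NET the parameter can be supplied as a DataTable with SqlDbType.Structured, so the client never has to build the VALUES list by hand.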

Thanks for any hints, Marks

+1  A: 

One stored procedure, so far as possible:

INSERT INTO MyTable(field1,field2)
SELECT 'firstValue', 'secondValue'
UNION ALL
SELECT 'anotherFirstValue', 'anotherSecondValue'
UNION ALL
...

If you aren't sure how many items you're inserting, you can construct the SQL query within the sproc and then execute it. Here's a procedure I wrote to take a CSV list of groups and add their relationship to a user entity:

ALTER PROCEDURE [dbo].[UpdateUserADGroups]
    @username varchar(100),
    @groups   varchar(5000)
AS
BEGIN
    DECLARE @pos int,
            @previous_pos int,
            @value varchar(50),
            @sql varchar(8000)

    SET @pos = 1
    SET @previous_pos = 0
    SET @sql = 'INSERT INTO UserADGroups(UserID, RoleName) '

    DECLARE @userID int
    SET @userID = (SELECT TOP 1 UserID FROM Users WHERE Username = @username)

    -- Walk the comma-separated list and append one SELECT per value.
    WHILE @pos > 0
    BEGIN
        SET @pos = CHARINDEX(',', @groups, @previous_pos + 1)
        IF @pos > 0
        BEGIN
            SET @value = SUBSTRING(@groups, @previous_pos + 1, @pos - @previous_pos - 1)
            SET @sql = @sql + 'SELECT ' + cast(@userID as varchar(10)) + ',''' + @value + ''' UNION ALL '
            SET @previous_pos = @pos
        END
    END

    -- The last value has no trailing comma.
    IF @previous_pos < LEN(@groups)
    BEGIN
        SET @value = SUBSTRING(@groups, @previous_pos + 1, LEN(@groups))
        SET @sql = @sql + 'SELECT ' + cast(@userID as varchar(10)) + ',''' + @value + ''''
    END

    print @sql
    exec (@sql)
END

This is far faster than individual INSERTs.

Also, make sure you have just a single clustered index on the primary key; additional indexes will slow the INSERT down because they also have to be updated.

However, the more complex your dataset is, the less likely it is that you'll be able to do the above, so you will simply have to make logical compromises. I actually ended up calling the above routine around 8000 times.
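For more deeply nested data, one alternative (a sketch, assuming SQL Server 2005 or later; the table and element names here are made up) is to pass the whole item list as a single XML parameter and shred it inside the procedure:

```sql
CREATE PROCEDURE dbo.CreateSubItemsFromXml
    @items xml  -- e.g. <items><item id="1"><sub>a</sub><sub>b</sub></item></items>
AS
BEGIN
    -- nodes() turns each XML element into a row; CROSS APPLY walks the children.
    INSERT INTO SubItems (ItemID, Payload)
    SELECT i.n.value('@id', 'int'),
           s.n.value('.', 'nvarchar(200)')
    FROM @items.nodes('/items/item') AS i(n)
    CROSS APPLY i.n.nodes('sub') AS s(n);
END
```

One call then inserts an arbitrary number of items and sub-items, and the plan for the procedure can be cached regardless of the row count.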

David Neale
Are you really sure that building a query is faster than individual inserts (did you benchmark it)? I think SQL Server will not be able to use cached execution plans for on-the-fly generated queries. But I will give it a try. Any idea on the problem with IDs? I think it would be faster to generate all parent elements first and then all child elements. But how can I set the parent's ID in the child elements' foreign keys?
Marks
The only way to tell is to profile; I found that it made my entire routine much faster (can't remember the times). I don't know if it will cache the execution plan, but it doesn't really need to as it'll be so basic. My example will only work with one level of nesting (one record with many basic children); maybe take a look at serialising your items to XML and parsing them from it - http://weblogs.asp.net/jgalloway/archive/2007/02/16/passing-lists-to-sql-server-2005-with-xml-parameters.aspx
David Neale
I did some profiling with a CreateUser SP of mine. Case 1: EXEC CreateUser @Name=..., ... for 1000 rows. Case 2: INSERT INTO Users (...) VALUES (...),(...),... for 1000 rows. Case 1: first run 220 ms, repeated ~220 ms. Case 2: first run 2960 ms, repeated ~20 ms. Even when not building the INSERT INTO on the fly, it is much slower on every first run with a different number of users, because the cached execution plan is fixed for the number of rows submitted. I will look for an XML solution and hope to get better results. But thanks.
Marks
That makes sense; I'll bear it in mind for the future. Be sure to post an answer to this question if you find another solution. I'll be especially interested to hear if you do something with XML.
David Neale
Is there any way to tell how many execution plans can be stored in the cache? Or whether they are deleted after a while? As the numbers of items inserted at once do not differ that much, I am thinking of using the INSERT INTO A (...) VALUES (...),(...) solution and priming the execution plans for the common row counts at installation time. It would need approximately 100 different statements. I think that's something the cache should be able to handle, but I can't find any information on this.
Marks
I'm not sure, to be honest. A cache is not designed for persistent data storage, and I don't think there's any way to manually add query plans to it.
David Neale
But you can fire the queries with dummy data for the different numbers of items, so the execution plans will be cached. Still, I think there is some kind of cache eviction if a plan is not used for a while.
Marks