I have an insert query that gets generated like this:

INSERT INTO InvoiceDetail (LegacyId,InvoiceId,DetailTypeId,Fee,FeeTax,Investigatorid,SalespersonId,CreateDate,CreatedById,IsChargeBack,Expense,RepoAgentId,PayeeName,ExpensePaymentId,AdjustDetailId) 
VALUES(1,1,2,1500.0000,0.0000,163,1002,'11/30/2001 12:00:00 AM',1116,0,550.0000,850,NULL,@ExpensePay1,NULL); 
DECLARE @InvDetail1 INT; SET @InvDetail1 = (SELECT @@IDENTITY);

A query like this is generated for each of only 110K rows.

It takes 30 minutes for all of these queries to execute.

I checked the query plan, and the largest-cost nodes are:

A Clustered Index Insert at 57% query cost, which has a long XML plan that I don't want to post.

A Table Spool at 38% query cost:

<RelOp AvgRowSize="35" EstimateCPU="5.01038E-05" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="80" Parallel="false" PhysicalOp="Table Spool" EstimatedTotalSubtreeCost="0.0466109">
  <OutputList>
    <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvoiceId" />
    <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvestigatorId" />
    <ColumnReference Column="Expr1054" />
    <ColumnReference Column="Expr1055" />
  </OutputList>
  <Spool PrimaryNodeId="3" />
</RelOp>

So my question is: what can I do to improve the speed of this thing? I already run ALTER TABLE TableName NOCHECK CONSTRAINT ALL before the queries and then ALTER TABLE TableName CHECK CONSTRAINT ALL after the queries.

And that hardly shaved anything off the time.

Now, I am running these queries in a .NET application that uses a SqlCommand object to send each query.

I then tried outputting the SQL commands to a file and executing it using sqlcmd, but I wasn't getting any updates on its progress, so I gave up on that.

Any ideas or hints or help?

UPDATE:

OK, so all of you were very helpful. In this situation I wish I could give credit to more than one answer.

The solution to fix this was twofold.

The first:

1) I disabled/re-enabled all the foreign keys (much easier than dropping them):

ALTER TABLE TableName NOCHECK CONSTRAINT ALL
ALTER TABLE TableName CHECK CONSTRAINT ALL

2) I disabled/re-enabled the indexes (again, much easier than dropping them):

ALTER INDEX [IX_InvoiceDetail_1] ON [dbo].[InvoiceDetail] DISABLE
ALTER INDEX [IX_InvoiceDetail_1] ON [dbo].[InvoiceDetail] REBUILD PARTITION = ALL
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON,
      ALLOW_PAGE_LOCKS = ON, ONLINE = OFF, SORT_IN_TEMPDB = OFF)

The second:

I wrapped all of the insert statements into one transaction. I initially didn't know how to do that in .NET.
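Here is roughly what that looks like in ADO.NET; this is only a sketch, with the connection string and the source of the generated statements left as placeholders:

// Rough sketch: run every generated INSERT inside a single SqlTransaction.
// connStr and insertStatements are placeholders for illustration.
using System.Collections.Generic;
using System.Data.SqlClient;

class BatchedInserts
{
    public static void Run(string connStr, IEnumerable<string> insertStatements)
    {
        using (var conn = new SqlConnection(connStr))
        {
            conn.Open();
            using (SqlTransaction tx = conn.BeginTransaction())
            {
                using (SqlCommand cmd = conn.CreateCommand())
                {
                    cmd.Transaction = tx;   // each command must be enlisted explicitly
                    foreach (string sql in insertStatements)
                    {
                        cmd.CommandText = sql;
                        cmd.ExecuteNonQuery();
                    }
                }
                tx.Commit();                // one durable log flush instead of one per row
            }
        }
    }
}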

I really appreciate all of the input I got.

If I ever do this kind of DB-to-DB migration again, I will definitely start with BULK INSERT. It seems much more flexible and faster.

+4  A: 

You have tagged this question as "bulkinsert". So why not use the BULK INSERT command?

If you want progress updates you can split the bulk insert into smaller pieces and update the progress after each piece completes.
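For example, a rough sketch of issuing BULK INSERT from a .NET client; the file path, delimiters, and options here are assumptions, and the data file must be readable by the SQL Server service:

// Hedged sketch: issue a BULK INSERT statement through SqlCommand.
using System.Data.SqlClient;

class BulkInsertLoad
{
    public static void Load(SqlConnection conn)
    {
        const string sql = @"
            BULK INSERT dbo.InvoiceDetail
            FROM 'C:\exports\InvoiceDetail.csv'   -- hypothetical export file
            WITH (FIELDTERMINATOR = ',',
                  ROWTERMINATOR = '\n',
                  KEEPIDENTITY,       -- keep the source identity values
                  BATCHSIZE = 10000   -- commit every 10k rows
            );";
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.CommandTimeout = 0;   // large loads can exceed the 30s default
            cmd.ExecuteNonQuery();
        }
    }
}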

Mark Byers
Another good suggestion.
Jaxidian
Does BULK INSERT allow you to supply values for identity columns? I must admit I don't know too much about BULK INSERT; I am researching it, though.
Jose
@Jose: I think this article answers your question: http://msdn.microsoft.com/en-us/library/ms186335.aspx
Mark Byers
+7  A: 

Sounds like the inserts are causing SQL Server to recalculate the indexes. One possible solution would be to drop the index, perform the insert, and re-add the index. With your attempted solution, even if you tell it to ignore constraints, it will still need to keep the index updated.

Jaxidian
+2  A: 

There are several things you can do:

1) Disable any triggers on this table
2) Drop all indexes
3) Drop all foreign keys
4) Disable any check constraints
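For example, a rough sketch of those steps as T-SQL sent from .NET (the table and index names come from the question; everything else is assumed). Be careful to disable only nonclustered indexes, since disabling the clustered index makes the table inaccessible:

// Hedged sketch of steps 1-4; run the reverse (ENABLE/CHECK/REBUILD) after the load.
using System.Data.SqlClient;

class BulkLoadPrep
{
    public static void Prepare(SqlConnection conn)
    {
        string[] statements =
        {
            "DISABLE TRIGGER ALL ON dbo.InvoiceDetail",                       // 1) triggers
            "ALTER INDEX [IX_InvoiceDetail_1] ON dbo.InvoiceDetail DISABLE",  // 2) nonclustered indexes only
            "ALTER TABLE dbo.InvoiceDetail NOCHECK CONSTRAINT ALL"            // 3) + 4) FK and check constraints
        };
        foreach (string sql in statements)
            using (var cmd = new SqlCommand(sql, conn))
                cmd.ExecuteNonQuery();
    }
}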
Randy Minder
A: 

Hm, let it run and check the performance counters. What do you see? What disk layout do you have? I can insert millions of rows in 30 minutes, nearly a hundred million rows, to be exact (real-time financial information, linked to 3 other tables). I pretty much bet that your IO layout is bad (i.e. bad disk structure, bad file distribution).

TomTom
+5  A: 

Are you executing these queries one at a time from a .NET client (i.e. sending 110,000 separate query requests to SQL Server)?

In that case, the likely bottleneck is the network latency and other overhead of sending these INSERTs to SQL Server without batching them, not SQL Server itself.

Check out BULK INSERT.

Patrick
+2  A: 

Running individual INSERTs is always going to be the slowest option. Also, what's the deal with the @@IDENTITY? It doesn't look like you need to keep track of those in between.

If you don't want to use BULK INSERT from a file or SSIS, there is a SqlBulkCopy class in ADO.NET, which would probably be your best bet if you absolutely have to do this from within a .NET program.
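For example, a minimal SqlBulkCopy sketch; the DataTable of source rows and the connection string are placeholders, and NotifyAfter gives you progress updates as a bonus:

// Minimal SqlBulkCopy sketch; 'rows' is assumed to hold the 110k source rows.
using System;
using System.Data;
using System.Data.SqlClient;

class BulkCopyLoad
{
    public static void Copy(string connStr, DataTable rows)
    {
        using (var bulk = new SqlBulkCopy(connStr, SqlBulkCopyOptions.KeepIdentity))
        {
            bulk.DestinationTableName = "dbo.InvoiceDetail";
            bulk.BatchSize = 10000;     // rows per batch sent to the server
            bulk.NotifyAfter = 10000;   // fire SqlRowsCopied every 10k rows
            bulk.SqlRowsCopied += (sender, e) =>
                Console.WriteLine("{0} rows copied", e.RowsCopied);
            bulk.WriteToServer(rows);
        }
    }
}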

110k rows should take less time to import than me researching and writing this answer.

Cade Roux
+1 for SqlBulkCopy - this is also a good suggestion.
Mark Byers
+2  A: 

Most likely this is commit flush wait. If you don't wrap sets of INSERTs into an explicitly managed transaction, then each INSERT is its own auto-committed transaction: each INSERT automatically issues a commit, and the commit has to wait until the log is durable (i.e. written to disk). Flushing the log after each insert is extremely slow.

For instance, trying to insert 100k rows like yours in single-row-commit style:

set nocount on; 
declare @start datetime = getutcdate();  

declare @i int = 0;
while @i < 100000
begin
INSERT INTO InvoiceDetail (
  LegacyId,InvoiceId,DetailTypeId,Fee,
  FeeTax,Investigatorid,SalespersonId,
  CreateDate,CreatedById,IsChargeBack,
  Expense,RepoAgentId,PayeeName,ExpensePaymentId,
  AdjustDetailId) 
  VALUES(1,1,2,1500.0000,0.0000,163,1002,
    '11/30/2001 12:00:00 AM',
    1116,0,550.0000,850,NULL,1,NULL); 
  set @i = @i+1;
end

select datediff(ms, @start, getutcdate());

This runs in about 12 seconds on my server. But adding transaction management and committing every 1000 rows, the insert of 100k rows takes only about 4 seconds:

set nocount on;  
declare @start datetime = getutcdate();  

declare @i int = 0;
begin transaction
while @i < 100000
begin
INSERT INTO InvoiceDetail (
  LegacyId,InvoiceId,DetailTypeId,
  Fee,FeeTax,Investigatorid,
  SalespersonId,CreateDate,CreatedById,
  IsChargeBack,Expense,RepoAgentId,
  PayeeName,ExpensePaymentId,AdjustDetailId) 
  VALUES(1,1,2,1500.0000,0.0000,163,1002,
    '11/30/2001 12:00:00 AM',
    1116,0,550.0000,850,NULL,1,NULL); 
  set @i = @i+1;
  if (@i%1000 = 0)
  begin
    commit
    begin transaction
  end  
end
commit;
select datediff(ms, @start, getutcdate());

Also, given that I can insert 100k rows in 12 seconds even without the batched commit, while you need 30 minutes, it's worth investigating 1) the speed of your IO subsystem (e.g. what Avg. sec per Transaction you see on the drives) and 2) what else the client code is doing between retrieving the @@IDENTITY from one call and invoking the next insert. It could be that the bulk of the time is on the client side of the stack. One simple solution would be to launch multiple inserts in parallel (BeginExecuteNonQuery) so you feed SQL Server inserts constantly.
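A rough sketch of that last idea, with an assumed cap of 8 in-flight requests; on .NET before 4.5 the connection string must include Asynchronous Processing=true, and 'statements' is a placeholder for the generated INSERT text:

// Hedged sketch: overlap INSERTs with the APM pattern (BeginExecuteNonQuery).
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class OverlappedInserts
{
    public static void Run(string connStr, IEnumerable<string> statements)
    {
        var inFlight = new Queue<KeyValuePair<SqlCommand, IAsyncResult>>();
        foreach (string sql in statements)
        {
            var conn = new SqlConnection(connStr);
            conn.Open();
            var cmd = new SqlCommand(sql, conn);
            inFlight.Enqueue(new KeyValuePair<SqlCommand, IAsyncResult>(
                cmd, cmd.BeginExecuteNonQuery()));

            if (inFlight.Count >= 8)        // keep at most 8 requests in flight
                Complete(inFlight.Dequeue());
        }
        while (inFlight.Count > 0)
            Complete(inFlight.Dequeue());
    }

    static void Complete(KeyValuePair<SqlCommand, IAsyncResult> pending)
    {
        pending.Key.EndExecuteNonQuery(pending.Value);  // throws if the INSERT failed
        pending.Key.Connection.Dispose();
    }
}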

Remus Rusanu