Hi,

Is there a way to improve the performance of this kind of SQL query:

INSERT
INTO ...
WHERE NOT EXISTS(Validation...)

The problem is that when the table holds a lot of data (millions of rows), the WHERE NOT EXISTS clause is very slow to execute. I have to do this check because I can't insert duplicate data.

I use SQL Server 2005.

Thanks

+5  A: 

Make sure you are searching on indexed columns, with no manipulation of the data within those columns (SUBSTRING and the like).
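As a rough illustration of that advice, here is a minimal sketch. The table and column names (dbo.Target, dbo.Staging, CustomerCode, Name) are hypothetical and stand in for the real tables in the question, which were not shown.

```sql
-- Hypothetical schema: dbo.Target holds the existing rows,
-- dbo.Staging holds the candidate rows to insert.
CREATE INDEX IX_Target_CustomerCode ON dbo.Target (CustomerCode);

-- Index-friendly: the indexed column is compared as-is,
-- so the optimizer can seek on IX_Target_CustomerCode.
INSERT INTO dbo.Target (CustomerCode, Name)
SELECT s.CustomerCode, s.Name
FROM dbo.Staging AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.Target AS t
                  WHERE t.CustomerCode = s.CustomerCode);

-- Usually not index-friendly: wrapping the indexed column in a function
-- generally prevents an index seek on that column.
--   WHERE NOT EXISTS (SELECT 1
--                     FROM dbo.Target AS t
--                     WHERE SUBSTRING(t.CustomerCode, 1, 5) = s.CustomerCode)
```

Comparing the execution plans of the two forms should show whether the index is actually being used.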

ck
A: 

Pay attention to the other answer regarding indexing. NOT EXISTS is typically quite fast if you have good indexes.

But I have had performance issues with statements like you describe. One method I've used to get around that is to use a temp table for the candidate values, perform a DELETE FROM ... WHERE EXISTS (...), and then blindly INSERT the remainder. Inside a transaction, of course, to avoid race conditions. Splitting up the queries sometimes allows the optimizer to do its job without getting confused.
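The temp-table approach described above might look roughly like this. It is only a sketch: dbo.Target, dbo.Staging, #Candidates and the column names are hypothetical, since the original tables were not shown.

```sql
BEGIN TRANSACTION;

-- 1. Copy the candidate rows into a temp table.
SELECT s.CustomerCode, s.Name
INTO #Candidates
FROM dbo.Staging AS s;

-- 2. Remove every candidate that already exists in the target table.
DELETE c
FROM #Candidates AS c
WHERE EXISTS (SELECT 1
              FROM dbo.Target AS t
              WHERE t.CustomerCode = c.CustomerCode);

-- 3. Insert whatever is left without any further check.
INSERT INTO dbo.Target (CustomerCode, Name)
SELECT c.CustomerCode, c.Name
FROM #Candidates AS c;

COMMIT TRANSACTION;

DROP TABLE #Candidates;
```

Note that under the default READ COMMITTED isolation level a transaction alone may not block a concurrent session from inserting the same key between steps 2 and 3; a stricter isolation level or locking hints may be needed if several writers run at once.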

dwc
+3  A: 

Try replacing the NOT EXISTS with a left outer join; it sometimes performs better on large data sets.

Otávio Décio
Funny, I've more often found the opposite. EXISTS will stop searching at the first match found, whereas a join makes all possible matches. Thus EXISTS ought to be faster. I think.
sfuqua
The thing is, NOT EXISTS will always cause a table scan whereas if you are careful with your join you might be working solely with indexes.
Otávio Décio
A: 

If you can at all reduce your problem space, then you'll gain heaps of performance. Are you absolutely sure that every one of those rows in that table needs to be checked?

The other thing you might want to try is running DELETE InsertTable FROM InsertTable INNER JOIN ExistingTable ON <Validation criteria> before your insert. However, your mileage may vary.

hova
+3  A: 

Off the top of my head, you could try something like:

 TRUNCATE TABLE temptable
 INSERT INTO temptable ...
 INSERT INTO temptable ... 
 ...
 INSERT INTO realtable
 SELECT temptable.* FROM temptable
 LEFT JOIN realtable ON realtable.[key] = temptable.[key]
 WHERE realtable.[key] IS NULL
Blorgbeard
A: 

insert into customers select * from newcustomers where customerid not in (select customerid from customers)

..may be more efficient. As others have said, make sure you've got indexes on any lookup fields.
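One caveat with NOT IN: if the subquery can return a NULL customerid, the NOT IN predicate never evaluates to true and no rows are inserted at all. A NULL-safe sketch of the same statement, reusing the customers and newcustomers tables from the query above, would be:

```sql
-- Same intent as the NOT IN version, but unaffected by NULLs
-- in customers.customerid.
INSERT INTO customers
SELECT n.*
FROM newcustomers AS n
WHERE NOT EXISTS (SELECT 1
                  FROM customers AS c
                  WHERE c.customerid = n.customerid);
```

If customerid is declared NOT NULL in customers, the two forms return the same rows.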

SqlACID
A: 

Try this:

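MERGE
INTO A
USING B
ON a.[key] = b.[key]
WHEN NOT MATCHED THEN
  INSERT (a_field1, a_field2)
  -- TO_DATE/TO_CHAR are Oracle functions; T-SQL uses CAST or CONVERT.
  -- The target types below are assumptions.
  VALUES (CAST(b.b_field1 AS datetime), CAST(b.b_field2 AS varchar(50)));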

It works only on SQL Server 2008 and later, though.

Quassnoi
While MERGE is well suited to this operation, it is a SQL Server 2008 feature, and the author has specifically stated, "I use SQLServer 2005".
Cadaeic
Right, didn't notice that.
Quassnoi
A: 

Would this MERGE INTO A USING B ON a.key = b.key WHEN NOT MATCHED INSERT (a_field1, a_field2) VALUES (TO_DATE(b_field1), TO_CHAR(b_field2)) insert the non-matching values? In other words, is it the equivalent of an outer join?

A: 

I'm struggling with the exact same problem.