views:

106

answers:

6

Hi,

Is this the most efficient way to delete from DeletedProducts table where there are not references found in the ProductFileInfo table?

Sample:

DELETE FROM DeletedProducts 
WHERE ProductId NOT IN SELECT DISTINCT ProductID FROM ProductFileInfo

Or is NOT EXIST a better way to perform this.

Note: ProductFileInfo has over 20 Million records in it.

SQL Server 2005 Standard is what I am using.

Thanks

A: 

Check the estimated execution plans.

If ProductID is not nullable and you get rid of the DISTINCT you'll probably find they are exactly the same and both use an anti-semi-join.

Martin Smith
+3  A: 

NOT IN and NOT EXISTS will probably produce the same plan. How many rows are you going to delete? If it is a lot I would do batches of 5K or 10K this way you won't fill your LOG with one big transaction and then if it fails for whatever reason it needs to do a big rollback

for example

DELETE top 5000
from sometable
where ....
go 100 --will be executed 100 times

in order for GO N to work you need SSMS and service pack 2 (IIRC) but of course you can also write a while loop..while @@rowcount > 0.......

SQLMenace
A: 

This is a great resource for exactly what your asking.

http://www.sql-server-performance.com/tips/t_sql_where_p3.aspx

I looked at the first couple of tips and they seem extremely dubious (maybe they were once true but they are flat out incorrect for SQL Server 2005+)
Martin Smith
Thats pretty unfair of you, this was written by a well respected author in the community, and I have seen multiple presentations given on multiple pieces of it by other respected presenters as recently as 2 weeks ago.
@daytona - I was a bit surprised as well that Brad McGehee's name was besides that. I wonder if he wrote it some time ago and some one else updated it to say it applied to SQL2005. In SQL 2005+ though [Not In vs Not Exists](http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/) and [In vs Exists](http://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/). Do some experimentation yourself. If you find any different let me know!
Martin Smith
And also of interest [Left outer join vs NOT EXISTS](http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/)
Martin Smith
A: 

try multiple solutions and test their performance, YMMV. Also try an outer join

DELETE FROM DeletedProducts d left outer join ProductFileInfo p
on d.ProductId  = p.ProductId  WHERE p.ProductID is null
Beth
A: 

Well, I typically write something along the lines of

delete d
from TableToDeleteFrom d
where not exists(select top 1 1 from SomeOtherTable sot where sot.ThatId = d.ThatId)

Probably, you don't want to lock the look-up table, so you can specify with(readpast) hint or with(nolock).

Denis Valeev
A: 

In SQL Server 2008, Delete can run faster compare to the previous version. Please check the followig link to find out more on this http://www.abhyast.com/abhyastcom/post/Did-you-know-You-Can-Make-the-Delete-Faster-by-Using-TOP.aspx