First we get the max id from ProductFileLocalName and subtract 1000 (we don't want to delete the most recent additions, since they might not have been inserted into ProductFileInfo yet).

Then we pass the max id to this stored procedure:

DELETE TOP (10000)
FROM ProductFileLocalName WITH (ROWLOCK)
FROM ProductFileLocalName
    LEFT OUTER JOIN ProductFileInfo AS pfi WITH (NOLOCK) ON ProductFileLocalName.ProductFileLocalNameId = pfi.ProductFileLocalNameId
WHERE (ProductFileLocalName.ProductFileLocalNameId < @maxid AND pfi.ProductFileInfoId IS NULL);
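For reference, a minimal sketch of how this could be driven to completion in one batch, with the delete inlined instead of wrapped in the stored procedure; the loop and variable names are illustrative, not taken from the question:

DECLARE @maxid INT;
DECLARE @rows INT = 1;

-- Leave a margin of 1000 ids so the newest rows, which may not yet have a
-- matching ProductFileInfo row, are never considered for deletion.
SELECT @maxid = MAX(ProductFileLocalNameId) - 1000
FROM ProductFileLocalName;

-- Repeat the batched delete until a batch affects no rows.
WHILE @rows > 0
BEGIN
    DELETE TOP (10000)
    FROM ProductFileLocalName WITH (ROWLOCK)
    FROM ProductFileLocalName
        LEFT OUTER JOIN ProductFileInfo AS pfi WITH (NOLOCK)
            ON ProductFileLocalName.ProductFileLocalNameId = pfi.ProductFileLocalNameId
    WHERE ProductFileLocalName.ProductFileLocalNameId < @maxid
      AND pfi.ProductFileInfoId IS NULL;

    SET @rows = @@ROWCOUNT;
END;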

Is this the most effective way to perform this operation?

A: 

If you are really keeping just 1,000 rows out of a million, do you have the option to copy what you want to keep into a duplicate table (identical schema), then nuke the big one and copy that small subset back? You would need to measure the timing for this option and check how long a continuous delay you can afford. A rough sketch follows.
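This sketch rests on several assumptions not stated in the answer: a staging table ProductFileLocalName_keep with an identical schema already exists, LocalName stands in for the remaining columns, ProductFileLocalNameId is an identity column, and no incoming foreign keys block the TRUNCATE.

DECLARE @maxid INT;

SELECT @maxid = MAX(ProductFileLocalNameId) - 1000
FROM ProductFileLocalName;

BEGIN TRANSACTION;

-- Copy the rows worth keeping: anything still referenced by ProductFileInfo,
-- plus the newest ids that may not be linked yet.
INSERT INTO ProductFileLocalName_keep (ProductFileLocalNameId, LocalName)
SELECT pfln.ProductFileLocalNameId, pfln.LocalName
FROM ProductFileLocalName AS pfln
WHERE pfln.ProductFileLocalNameId >= @maxid
   OR EXISTS (SELECT 1
              FROM ProductFileInfo AS pfi
              WHERE pfi.ProductFileLocalNameId = pfln.ProductFileLocalNameId);

-- TRUNCATE is minimally logged, unlike deleting a million rows batch by batch.
TRUNCATE TABLE ProductFileLocalName;

-- Copy the small subset back, preserving the original identity values.
SET IDENTITY_INSERT ProductFileLocalName ON;

INSERT INTO ProductFileLocalName (ProductFileLocalNameId, LocalName)
SELECT ProductFileLocalNameId, LocalName
FROM ProductFileLocalName_keep;

SET IDENTITY_INSERT ProductFileLocalName OFF;

COMMIT TRANSACTION;

The table is effectively unavailable between the TRUNCATE and the COMMIT, which is the continuous delay you would need to measure against what you can afford.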

Another option is to figure out a column that can serve as a partitioning column - assuming that these million records came in over some longer period of time, you can probably establish safe timing limits and always go after the older partition (or partitions), even switching them out first (see the sketch below).
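A sketch of the partitioning route, with everything here assumed rather than taken from the question: a date column to partition on, the object names, and the table having been rebuilt on the partition scheme. Removing an old partition then becomes a metadata operation rather than a row-by-row delete.

-- Partition function and scheme on an assumed date column, one boundary per month.
CREATE PARTITION FUNCTION pf_ProductFileLocalName_ByMonth (datetime2(0))
AS RANGE RIGHT FOR VALUES ('2013-01-01', '2013-02-01', '2013-03-01');

CREATE PARTITION SCHEME ps_ProductFileLocalName_ByMonth
AS PARTITION pf_ProductFileLocalName_ByMonth ALL TO ([PRIMARY]);

-- With the table (re)built on ps_ProductFileLocalName_ByMonth, the oldest
-- partition can be switched into an empty table of identical structure on the
-- same filegroup and truncated there, which is close to instantaneous.
ALTER TABLE ProductFileLocalName
    SWITCH PARTITION 1 TO ProductFileLocalName_switch;

TRUNCATE TABLE ProductFileLocalName_switch;

On SQL Server 2016 and later, TRUNCATE TABLE ProductFileLocalName WITH (PARTITIONS (1)) can replace the SWITCH step entirely.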

As other folks mentioned, you need to put more concrete info into the question if you want people to consider specific scenarios instead of guessing - there's no single tactic that fits all big deletions.

ZXX