views:

851

answers:

6

First note that I have seen this question:TSQL delete with an inner join

I have a large table and several foreign key relations, each of which have data of a given age. We need to remove data older than a given data on a regular basis to stop the DB from growing without bound.

I'm writing a query that will delete from each point on the star if you will by the given parameters (unfortunately these are configurable and different between the tables).

After this first deletion, I have a central table that I'm worried that I'm doing twice the work attempting to delete, as on delete the database checks the conditionals. I have a set of:

AND NOT EXISTS
(SELECT key 
FROM table 
WHERE table.key = centretable.key)

which TSQL is making into a right anti semi join and doing it nicely on the indexes. The problem is it creates a list of stuff to delete and then does the same checks again as it performs the delete.

I guess my question is whether there is a try delete by row, (I'm not going to do that in a cursor as I know how slow it would be), but you would think that such a keyword would exist, I haven't had any luck finding it though.

A: 

If you're reusing the same list of stuff to delete then you could consider inserting the keys to delete into a temp table and then using this in the second query.

SELECT Key, ...
INTO #ToDelete
FROM Table T
WHERE ...

Then something like this

...
LEFT OUTER JOIN #ToDelete D
ON T.Key=D.Key
WHERE D.Key IS NULL

DROP #ToDelete
pjp
The query optimiser did this already. I'm checking to see if there is a keyword to delete when not linked so I can do it in one pass.
Spence
A: 

If you specified the foreign key as a constraint when creating the table in the database you can tell the database what to do in case of a delete, by setting the delete rule. This rule specifies what happens if a user tries to delete a row with data that is involved in a foreign key relationship. The "No action" setting tells the user that the deletion is not allowed and the DELETE is rolled back. Implementing it like that would keep you from checking it yourself before deleting it, and thus could be seen as some kind of try. Well, at least it works like that in MS SQL. http://msdn.microsoft.com/en-us/library/ms177288.aspx

gijswijs
I want the opposite. I'm not trying to cascade the delete. I'm trying to delete anything that is not connected to anything else.
Spence
+2  A: 

In terms of a single command that only checks the relationships once (rather than twice in your example - once for the NOT EXISTS, once for the DELETE), then I expect the answer is a big fat no, sorry.

(off the wall idea): If this is a major problem, you could try some kind of reference-counting implementation, using triggers to update the counter - but in reality I expect this will be a lot more overhead to maintain than simply checking the keys like you are already.

You could also investigate NOCHECK during the delete (since you are checking it yourself); but you can only do this at the table level (so probably OK for admin scripts, but not for production code) - i.e.:

-- disable
alter table ChildTableName nocheck constraint ForeignKeyName

-- enable
alter table ChildTableName check constraint ForeignKeyName

A quick test shows that with it enabled it does an extra Clustered Index Scan on the foreign key; with it disabled, this is omitted.

Here's a full example; you can look at the query plan of the two DELETE operations... (ideally in isolation from the rest of the code):

create table parent (id int  primary key)
create table child (id int  primary key, pid int)
alter table child add constraint fk_parent foreign key (pid)
    references parent (id)

insert parent values (1)
insert parent values (2)
insert child values (1,1)
insert child values (2,1)

-- ******************* THIS ONE CHECKS THE FOREIGN KEY
delete from parent
where not exists (select 1 from child where pid = parent.id)

-- reset
delete from child
delete from parent
insert parent values (1)
insert parent values (2)
insert child values (1,1)
insert child values (2,1)

-- re-run with check disabled
alter table child nocheck constraint fk_parent

-- ******************* THIS ONE DOESN'T CHECK THE FOREIGN KEY    
delete from parent
where not exists (select 1 from child where pid = parent.id)

-- re-enable
alter table child check constraint fk_parent

Again - I stress this should only be run from things like admin scripts.

Marc Gravell
Unfortunately they are foreign keys not check constraints. Cheers for your input though.
Spence
I'll leave it for a few days and give you the upvote if no one counters your assertion.
Spence
Doi, FKeys in TSQL are check constraints, albeit special ones. Its a valid suggestion and in the transaction might be the only way to do it if performance becomes an issue.
Spence
But doing this will leave you with untrusted constraints, that might impact future queries: See http://sqlblog.com/blogs/hugo%5Fkornelis/archive/2007/03/29/can-you-trust-your-constraints.aspx You can re-enable with WITH CHECK option, but that makes another pass through the table which is what you are trying to avoid.
Shannon Severance
@Shannon - hence my point about using it from admin scripts only
Marc Gravell
A: 

I did find one article that discusses using an outer join in a delete: http://www.bennadel.com/blog/939-Using-A-SQL-JOIN-In-A-SQL-DELETE-Statement-Thanks-Pinal-Dave-.htm

I hope this works for you!

scwagner
A: 

The short answer to your question is no, there is no standard RDBMS keyword for deleting a master record when all foreign key references to it go away (and certainly none that would account for foreign keys in multiple tables).

Your most efficient option is a second query that is run on an as-needed basis to delete from "centre" based on a series of NOT EXISTS() clauses for each of the tables with foreign keys.

This is based on two statements I believe are both true for your situation:

  1. You will delete more "related" records than "centre" (parent) records. Thus, any operation that attempts to adjust "centre" every time you delete from one of the other tables will result in an instantaneous update to "centre", but will require much wasted querying to delete a "centre" record only occasionally.

  2. Given that there are multiple points on the star from "centre," any "wasted effort" checking for a foreign key in one of them is minimal compared to the whole. For instance, if there are four foreign keys to check before deleting from "centre", you can only save, at best, 25% of the time.

richardtallent
+1  A: 

You could create an Indexed view of your select sentence:

SELECT key FROM table WHERE table.key = centretable.key

The indexed view is a physical copy of the data it would therefore be very fast to check.

You do have the overhead of updating the view, so you would need to test this against your usage pattern.

Shiraz Bhaiji