views:

3612

answers:

9

When setting up foreign keys in SQL Server, under what circumstances should you have it cascade on delete or update, and what is the reasoning behind it?

This probably applies to other databases as well.

I'm looking most of all for concrete examples of each scenario, preferably from someone who has used them successfully.

+4  A: 

One example is when you have dependencies between entities... ie: Document -> DocumentItems (when you delete Document, DocumentItems don't have a reason to exist)

Juan Manuel
+1  A: 

I try to avoid deletes or updates that I didn't explicitly request in SQL server.

Either through cascading or through the use of triggers. They tend to bite you in the ass some time down the line, either when trying to track down a bug or when diagnosing performance problems.

Where I would use them is in guaranteeing consistency for not very much effort. To get the same effect you would have to use stored procedures.

pauliephonic
+9  A: 

I never use cascading deletes.

If I want something removed from the database I want to explicitly tell the database what I want taking out.

Of course they are a function available in the database and there may be times when it is okay to use them, for example if you have an 'order' table and an 'orderItem' table you may want to clear the items when you delete an order.

I like the clarity that I get from doing it in code (or stored procedure) rather than 'magic' happening.

For the same reason I am not a fan of triggers either.

Something to notice is that if you do delete an 'order' you will get '1 row affected' report back even if the cascaded delete has removed 50 'orderItem's.

Aidan
Why not get rid of primary keys, too? You would get the clarity of ensuring unique values in your code.
MusiGenesis
Snark aside, I actually work with a database system that has no primary keys defined on any tables.
MusiGenesis
@MusiGenesis, Aidan was not advocating removing the FK. The FK still protects the data, but without CASCADE ON .... unexpected magic does not happen.
Shannon Severance
@Shannon how the hell is it magic?? It may be unexpected, but that has to show up on testing. Do you always work with undocumented databases and without testing your code? I really have a hard time understanding this whole 'magic' thing around cascades.
Vinko Vrsalovic
Shannon Severance
@Shannon: Oh, sure, **that** is sensible. Implement FK cascading in SPs.
Vinko Vrsalovic
@Vinko. the problem of magic is not with competent developers or DBA's, it's with Joe interen 5 years later who's been given a 'simple'maintenance task when the DBA is on holiday and who then screws up corporate data without anyone realising it. Cascades have their place, but it's important to consider the full circumstances, including human factors, before deploying them.
Cruachan
@Cruachan: If the 'interen' can screw up production data just because you used cascade it shows you used cascade wrongly. And also shows you have crappy backup procedures.
Vinko Vrsalovic
@Vinko: Well yes, I guess it was used wrongly, but that's part of the issue - cascade makes you more vulnerable to bad decisions, or sometimes even decisions that were right at the time, but some years later were wrong because of other database changes. Those sort of problems are ridiculously to run into in a corporate environments with dozens of apps against a central database. Also backups are not much of a help if obscure but important data deletes and that is only identified 6 months down the line - even if you have a backup at the appropriate date restore is unlikely to be straightforward
Cruachan
@Vinko: I'm not really disagreeing with anything you say per see, it's simply that experience teaches me to be uber-cautious about cascades as unexpected consequences will turn around and bite you far harder if you have some error with them than without. In most circumstances I'd rather protect the data at the expense of some inconvenience than take the simpler but more vulnerable way out, even if the cascade is in some computer science way more 'correct'.
Cruachan
@Cruachan: The thing is that the alternative to implementing cascades in the DB is to implement them in your app's code (or, gasp, in SPs). That code will still bite you years ahead when the requirement or the environment changes. So there's really no good way out of it, unless what you are really advocating is to avoid deletion at all and just use soft deletes everywhere. Which might make sense in some cases, still the problem is not with the cascading in the database, is with cascading changes in general (whatever the hell does it.) Soft-deletion might indeed be the way to go in these ...
Vinko Vrsalovic
... environments where you never know when you might need some data that was entered years ago.
Vinko Vrsalovic
@Vinko: Why 'Gasp' SPs? SPs are defiantly the way to go where the database is a critical corporate asset. There's a strong argument in the sort of circumstances were talking about to restrict *all* data accesses to SPs, or at the least all but Select. See my answer under http://stackoverflow.com/questions/1171769/when-should-i-use-stored-procedures/1171981#1171981
Cruachan
@Cruachan: I was referring only and especifically to the replication of cascade delete in SPs which to me seems totally crazy. Using cascade in the table definition works at the same level of security and control.
Vinko Vrsalovic
+16  A: 

Foreign keys are the best way to ensure referential integrity of a database. Avoiding cascades due to being magic is like writing everything in assembly because you don't trust the magic behind compilers.

What is bad is the wrong use of foreign keys, like creating them backwards, for example.

Juan Manuel's example is the canonical example, if you use code there are many more chances of leaving spurious DocumentItems in the database that will come and bite you.

Cascading updates are useful, for instance, when you have references to the data by something that can change, say a primary key of a users table is the name,lastname combination. Then you want changes in that combination to propagate to wherever they are referenced.

@Aidan, That clarity you refer to comes at a high cost, the chance of leaving spurious data in your database, which is not small. To me, it's usually just lack of familiarity with the DB and inability to find which FKs are in place before working with the DB that foster that fear. Either that, or constant misuse of cascade, using it where the entities were not conceptually related, or where you have to preserve history.

Vinko Vrsalovic
Using that sort of 'natural' primary key is a really poor idea in the first place.
Nick Johnson
The idea was to show an example about the cascading updates, I agree it's not the best example though. File locations may be a better example.
Vinko Vrsalovic
RE: Comment directed to Aidan. No, leaving off CASCADE on an FK does not increase the chance of leaving spurious data. It decreases the chance that more data will be impacted by a command than was expected and increase code. Leaving out FKs entirely leave a chance of spurious data.
Shannon Severance
@Shannon, yes, what you say is true. Back then when I edited the answer I read it as if he was encouraging leaving off FKs. In any case, what I was really aiming at is that I truly don't understand why is people so afraid about cascades, as if knowing where a FK is set to cascade was unknowable. It may decrease the chance that more data is impacted, but at the cost of more code in the application that could be handled in the DB at lower overall cost. If you are cascading when cascaded entries really make no sense after the original entry is deleted then there is really no downside.
Vinko Vrsalovic
Having at least twice in my career seen the business-threatening consequences of a misunderstood cascade delete I'm very disinclined to use them myself in all except the most clear cut cases. In both cases data had been deleted as a result of a cascade that really should have been retained but wasn't - and that it was missing was not detected until the normal backup cycle had lost the possibility of an easy restore. Vinko is correct from a purely logical point of view, however in the real world using cascades exposes one to the human fallibility and unforeseen consequences more than I'd like.
Cruachan
I've actually coded systems where a management decision has been made that the users will have to explicitly delete all children in a master-detail before they can delete the master simply to force the user to think. Not using cascades when logically one could is a similar sort of firebreak.
Cruachan
@Cruachan: The rule, in my view, is simple. If the data is not as strongly related as to be useless without the parent data, then it doesn't warrant a cascade relationship. This I what I tried to address in the last phrase on my answer.
Vinko Vrsalovic
+4  A: 

I do a lot of database work and rarely find cascade deletes useful. The one time I have used them effectively is in a reporting database that is updated by a nightly job. I make sure that any changed data is imported correctly by deleting any top level records that have changed since the last import, then reimport the modified records and anything that relates to them. It save me from having to write a lot of complicated deletes that look from the bottom to the top of my database.

I don't consider cascade deletes to be quite as bad as triggers as they only delete data, triggers can have all kinds of nasty stuff inside.

In general I avoid real Deletes altogether and use logical deletes (ie. having a bit column called isDeleted that gets set to true) instead.

Martynnw
+8  A: 

Summary of what I've seen so far:

  • Some people don't like cascading at all.
    .

  • Cascade Delete may make sense when the semantics of the relationship can involve an "is part of" description. For example, and OrderLine record is part of it's parent order.

  • The canonical example for Cascade Delete is SomeObject and SomeObjectItems, where it doesn't make any sense for an items record to ever exist without a corresponding main record.
  • You should not use Cascade Delete if you are preserving history or using a "soft/logical delete" where you only set a deleted bit column to 1/true.
    .

  • Cascade Update may make sense when you use a real key rather than a surrogate key (identity/autoincrement column) across tables.

  • The canonical example for Cascade Update is when you have a mutable foreign key, like a username that can be changed.
  • You should not use Cascade Update with keys that are Identity/autoincrement columns.
    .

  • You may want to get an extra strong confirmation back from the user before allowing an operation to cascade, but it depends on your application.

  • Cascading can get you into trouble if you set up your foreign keys wrong. But you should be okay if you do that right.
  • It's not wise to use cascading before you understand it thoroughly. However, it is a useful feature and therefore worth taking the time to understand.
Joel Coehoorn
Note that cascade updates are also often used where the "so-called" natural keys appear not to be these real effective unique keys. In fact I am convinced that cascade updates are needed only with poorly normalised database models, and they are an open gate to messy tables and messy code.
Philippe Grondier
You are missing one important point, cascading can create huge performance issues if there are many child records.
HLGEM
@HLGEM - I don't see the relevance. If a cascade operations causes a slow down, the equivalent manual process would either cause the same slow down or not be correctly protected in case the transaction needs to be rolled back.
Joel Coehoorn
Why would it matter whether there's a cascade update on an IDENTITY or auto-increment column? I can see why it wouldn't be *necessary* because you shouldn't need to change those (arbitrary) values, but if one of them *did* change, at least the referential integrity would be intact.
Kenny Evitt
10 Bullets? Well now we know Joel isn't firing a revolver.
Neil N
+1  A: 

I, like everyone else here, find that cascade deletes are really only marginally helpful (it's really not that much work to delete referenced data in other tables -- if there are lot of tables, you simply automate this with a script) but really annoying when someone accidentally cascade deletes some important data that is difficult to restore.

The only case where I'd use is if the data in the table table is highly controlled (e.g., limited permissions) and only updated or deleted from through a controlled process (like a software update) that has been verified.

Jen A
+2  A: 

I work a lot with cascading deletes.

It feels good to know whoever works against the database might never leave any unwanted data. If dependencies grow I just change the constraints in the diagramm in Management Studio and I dont have to tweak sp or dataacces.

That said, I have 1 problem with cascading deletes and thats circular references. This often leads to parts of the database that have no cascading deletes.

Malcolm Frexner
+2  A: 

One reason to put in a cascade delete (rather than doing it in the code) is to improve performance.

Case 1: With a cascade delete

 DELETE FROM table WHERE SomeDate < 7 years ago;

Case 2: Without a cascade delete

 FOR EACH R IN (SELECT FROM table WHERE SomeDate < 7 years ago) LOOP
   DELETE FROM ChildTable WHERE tableId = R.tableId;
   DELETE FROM table WHERE tableId = R.tableid;
   /* More child tables here */
 NEXT

Secondly, when you add in an extra child table with a cascade delete, the code in Case 1 keeps working.

I would only put in a cascade where the semantics of the relationship is "part of". Otherwise some idiot will delete half of your database when you do:

DELETE FROM CURRENCY WHERE CurrencyCode = 'USD'
WW
Not knowing which datbase you use, I would suggest that your manual delete performs worse than cascading delete because it is not set based. In most datbases you can delete based on a join to another table and so have a set-based, much faster delete than looping through records.
HLGEM