The lead developer on a project I'm involved in says it's bad practice to rely on cascades to delete related rows.

I don't see how this is bad, but I would like to know your thoughts on if/why it is.

+2  A: 

I think this will mostly boil down to opinions, so I'll just give my thinking/reasoning.

Use them. Why not? They're there for a reason, and there are some things an RDBMS does better and faster. Think of it this way: when you're inserting a unique key, do you not rely on that constraint? Do you query distinct against the table and check that the value's not in use before inserting? No, of course not; that'd be silly and ridiculously slower. So why then would we refuse to use other perfectly legitimate features the RDBMS offers?

Nick Craver
Totally disagree.
Mitch Wheat
@Mitch - Like I said, highly opinionated :) I'd welcome your counter-thoughts on the issue!
Nick Craver
@Nick Craver: added an answer...
Mitch Wheat
@Nick Craver: it is not a matter of opinion. It's common sense. I know of no professional DBA or developer that uses cascading deletes.
Mitch Wheat
@Mitch - You can find several here: http://stackoverflow.com/questions/59297/whenwhy-to-use-cascading-in-sql-server Also, I'm Nick, nice to meet you...I'm paid to do this (definition of professional!) and have used them. Also, I've worked with dozens of DBAs that use it...why do you think the feature's there if it's 100% evil?
Nick Craver
@Nick: the majority of answers in the link you supplied agree that cascading deletes are a bad idea.
Mitch Wheat
@Nick: you've stated that you've worked with dozens of DBAs. If your profile is correct you must be changing jobs a lot to have worked with at least 24 DBAs in 4-6 years!!
Mitch Wheat
@Mitch - The companies I've worked with aren't small. When you have 130k+ employees, you tend to have more than a few DBAs...
Nick Craver
+6  A: 

I'll preface this by saying that I rarely delete rows, period. Generally you want to keep most data. You simply mark it as deleted so it won't be shown to users (i.e. to them it appears deleted). Of course it depends on the data, and for some things (e.g. shopping cart contents) actually deleting the records when the user empties his or her cart is fine.

I can only assume that the issue here is you may unintentionally delete records you don't actually want to delete. Referential integrity should prevent this however. So I can't really see a reason against this other than the case for being explicit.

cletus
I have to agree with cletus here, unless there's a legal or real capacity issue...storage is so cheap and fast there's rarely a real reason to do a hard delete these days. +1
Nick Craver
Here's one reason - deleted data creeping back into query results because someone forgot an `AND DELETED = 0`. If you do this, hide all deleted data behind a live-data-only view or similar, or you **will** run into problems.
Michael Petrotta
@Michael - Depends on your architecture, for example I have a linq project, and in the T4 templates it's trivial to add a `.Current()` extension method call in the `DataContext.GetTable<T>` references it generates. I could see how this would be a problem in other situations, but not *always*.
Nick Craver
@Nick: True. But you never touch your data with Query Analyzer, not even to do DBA work? No external reporting tools, Excel spreadsheets, etc.?
Michael Petrotta
That's what views are for. Name the table `foo_historical` and the view `foo` if you have to.
Joe Koberg
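A minimal sketch of the table-plus-view idea from the comments above, using Python's built-in `sqlite3` module so it runs anywhere. The names `foo_historical` and `foo` come from the comment; the `name` column, the `deleted` flag column and the sample rows are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical table keeps every row, including soft-deleted ones.
    CREATE TABLE foo_historical (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,               -- hypothetical payload column
        deleted INTEGER NOT NULL DEFAULT 0   -- 0 = live, 1 = soft-deleted
    );
    -- The view exposes live rows only, so nobody has to remember the filter.
    CREATE VIEW foo AS
        SELECT id, name FROM foo_historical WHERE deleted = 0;
    INSERT INTO foo_historical (name) VALUES ('alpha'), ('beta'), ('gamma');
""")

# "Delete" a row by flagging it rather than removing it.
conn.execute("UPDATE foo_historical SET deleted = 1 WHERE name = 'beta'")

live_names = [row[0] for row in conn.execute("SELECT name FROM foo ORDER BY id")]
all_count = conn.execute("SELECT COUNT(*) FROM foo_historical").fetchone()[0]
print(live_names, all_count)
```

Ad hoc query tools and reports that read from `foo` never see the flagged row, while `foo_historical` still holds it.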
@Michael - We do use external query tools, but typically LinqPad (so a non-issue) for quick queries, or Toad. I agree it could definitely lead to mistakes there. However, specific to my current project: in cases where people are looking at data that matters, they're going against a warehouse that doesn't import deleted rows. I guess that's a bit of a special situation, so it could definitely be much worse elsewhere.
Nick Craver
+1  A: 

I never use cascading deletes. Why? Because it is too easy to make a mistake. It is much safer to require client applications to delete explicitly (and to meet the conditions for deletion, such as first deleting the records that reference the row through a foreign key).

In fact, deletions per se can be avoided by marking records as deleted or moving into archival/history tables.

In the case of marking records as deleted, performance depends on the relative proportion of rows marked as deleted. SELECTs will have to filter on 'isDeleted = false', and an index on that column will only be used if less than about 10% of records (approximately, depending on the RDBMS) are marked as deleted.

Which of these 2 scenarios would you prefer:

1) Developer comes to you, says "Hey, this delete won't work". You both look into it and find that he was accidentally trying to delete the entire table's contents. You both have a laugh, and go back to what you were doing.

2) Developer comes to you, and sheepishly asks "Do we have backups?"

Mitch Wheat
Totally disagree :)
Nick Craver
If I have say 3+ levels of really trivial data, one delete call can take care of it. In a high-transaction environment where deletes are common, I don't need the app waiting orders of magnitude longer to do the same thing. They have their uses...otherwise no one would have taken the time to implement them, and they wouldn't be in every major RDBMS.
Nick Craver
Since when would you be deleting trivial data? Trivial data is usually static. That argument doesn't hold water.
Mitch Wheat
Let's say I'm backing an order out of the system: Order -> Line Items -> Comments. If I delete the order (confirmation prompt in the app, naturally), the rows in the other 2 tables are now useless. If you aren't set up to recover the parent record (in which case, you aren't deleting...), often the children no longer matter, and it's easier to cascade delete them all together in one database hit from the start. Like I said, it's a matter of application and opinion, but they **do** have their uses.
Nick Craver
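The Order -> Line Items -> Comments case above can be sketched with `ON DELETE CASCADE`. This is only an illustration in SQLite through Python's `sqlite3` module; the table names, column names and sample rows are assumptions, and other RDBMSs differ in syntax details and locking behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE line_items (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE
    );
    CREATE TABLE comments (
        id           INTEGER PRIMARY KEY,
        line_item_id INTEGER NOT NULL REFERENCES line_items(id) ON DELETE CASCADE
    );
    INSERT INTO orders (id) VALUES (1), (2);
    INSERT INTO line_items (id, order_id) VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO comments (id, line_item_id) VALUES (100, 10), (101, 11), (200, 20);
""")

# One statement removes order 1, its two line items and their two comments.
conn.execute("DELETE FROM orders WHERE id = 1")

remaining_items = conn.execute("SELECT COUNT(*) FROM line_items").fetchone()[0]
remaining_comments = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(remaining_items, remaining_comments)
```

Only the rows belonging to order 2 survive, which is the convenience being argued for and also the silent data loss the other answers warn about.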
A: 

I would say that you follow the principle of least surprise.

Cascading deletes should not cause unexpected loss of data. If a delete requires related records to be deleted, and the user needs to know that those records are going to go away, then cascading deletes should not be used. Instead, the user should be required to explicitly delete the related records, or be provided a notification.

On the other hand, if the table relates to another table that is temporary in nature, or that contains records that will never be needed once the parent entity is gone, then cascading deletes may be OK.

That said, I prefer to state my intentions explicitly by deleting the related records in code, rather than relying on cascading deletes. In fact, I've never actually used a cascading delete to implicitly delete related records. Also, I use soft deletion a lot, as described by cletus.

Robert Harvey
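For comparison, deleting the related records explicitly in code might look roughly like the sketch below. It uses Python's `sqlite3` module; the `parents` and `children` tables and the `delete_parent` helper are hypothetical names chosen for illustration, not something from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK; no cascade is declared
conn.executescript("""
    CREATE TABLE parents (id INTEGER PRIMARY KEY);
    CREATE TABLE children (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parents(id)
    );
    INSERT INTO parents (id) VALUES (1), (2);
    INSERT INTO children (id, parent_id) VALUES (10, 1), (11, 1), (20, 2);
""")

def delete_parent(conn, parent_id):
    """Delete the children first, then the parent, as a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("DELETE FROM children WHERE parent_id = ?", (parent_id,))
        conn.execute("DELETE FROM parents WHERE id = ?", (parent_id,))

delete_parent(conn, 1)

parents_left = [r[0] for r in conn.execute("SELECT id FROM parents ORDER BY id")]
children_left = [r[0] for r in conn.execute("SELECT id FROM children ORDER BY id")]
print(parents_left, children_left)
```

The intent is spelled out at the call site, at the cost of one extra statement per child table.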
A: 

Another huge reason to avoid cascading deletes is performance. They seem like a good idea until you need to delete 10,000 records from the main table which in turn have millions of records in child tables. Given the size of this delete, it is likely to completely lock down all of the tables involved for hours, maybe even days. Why would you ever risk this? For the convenience of saving ten minutes by not writing the extra delete statements for single-record deletes?

Further, the error you get when you try to delete a record that has a child record is often a good thing. It tells you that you don't want to delete this record because there is data you need that you would lose if you did so. A cascade delete would just go ahead and delete the child records, resulting in loss of information: for instance, information about orders, if you deleted a customer who had orders in the past. This sort of thing can thoroughly mess up your financial records.

HLGEM
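The protective error described above can be reproduced with a plain foreign key and no cascade. This is a minimal sketch in SQLite via Python's `sqlite3` module; the `customers` and `orders` schema and the sample row values are assumptions for illustration, and the exact error type and message vary by RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),  -- no cascade
        total       REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme');
    INSERT INTO orders (id, customer_id, total) VALUES (1, 1, 99.50);
""")

try:
    conn.execute("DELETE FROM customers WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    # The database refuses because an order still references this customer.
    delete_blocked = True
    conn.rollback()

customers_left = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(delete_blocked, customers_left, orders_left)
```

The failed delete leaves both the customer and the order history in place, which is the behaviour this answer argues you usually want.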