views:

416

answers:

1

I've been analyzing a recurring "bug report" (perf issue) in one of our systems related to a particularly slow delete operation. Long story short: It seems that the CASCADE DELETE keys were largely responsible, and I'd like to know (a) if this makes sense, and (b) why it's the case.

We have a schema of, let's say, widgets, those being at the root of a large graph of related tables and related-to-related tables and so on. To be perfectly clear, deleting from this table is actively discouraged; it is the "nuclear option" and users are under no illusions to the contrary. Nevertheless, it sometimes just has to be done.

The schema looks something like this:

Widgets
   |
   +--- Anvils [1:1]
   |    |
   |    +--- AnvilTestData [1:N]
   |
   +--- WidgetHistory (1:N)
        |
        +--- WidgetHistoryDetails (1:N)

Column definitions look like the following:

Widgets (WidgetID int PK, WidgetName varchar(50))
Anvils (AnvilID int PK, WidgetID int FK/IX/UNIQUE, ...)
AnvilTestData (AnvilID int FK/IX, TestID int, ...Test Data...)
WidgetHistory (HistoryID int PK, WidgetID int FK/IX, HistoryDate datetime, ...)
WidgetHistoryDetails (HistoryID int FK/IX, DetailType smallint, ...)

Nothing too scary, really. A Widget can be different types, an Anvil is a special type, so that relationship is 1:1 (or more accurately 1:0..1). Then there's a large amount of data - perhaps thousands of rows of AnvilTestData per Anvil collected over time, dealing with hardness, corrosion, exact weight, hammer compatibility, usability issues, and impact tests with cartoon heads.

Then every Widget has a long, boring history of various types of transactions - production, inventory moves, sales, defect investigations, RMAs, repairs, customer complaints, etc. There might be 10-20k details for a single widget, or none at all, depending on its age.

So, unsurprisingly, there's a CASCADE DELETE relationship at every level here. If a Widget needs to be deleted, it means something's gone terribly wrong and we need to erase any records of that widget ever existing, including its history, test data, etc. Again, nuclear option.

Relations are all indexed, statistics are up to date. Normal queries are fast. The system tends to hum along pretty smoothly for everything except deletes.

Getting to the point here, finally, for various reasons we only allow deleting one widget at a time, so a delete statement would look like this:

DELETE FROM Widgets
WHERE WidgetID = @WidgetID

Pretty simple, innocuous looking delete... that takes over 2 minutes to run, for a widget with no data!

After slogging through execution plans I was finally able to pick out the AnvilTestData and WidgetHistoryDetails deletes as the sub-operations with the highest cost. So I experimented with turning off the CASCADE (but keeping the actual FK, just setting it to NO ACTION) and rewriting the script as something very much like the following:

DECLARE @AnvilID int
SELECT @AnvilID = AnvilID FROM Anvils WHERE WidgetID = @WidgetID

DELETE FROM AnvilTestData
WHERE AnvilID = @AnvilID

DELETE FROM WidgetHistory
WHERE HistoryID IN (
    SELECT HistoryID
    FROM WidgetHistory
    WHERE WidgetID = @WidgetID)

DELETE FROM Widgets WHERE WidgetID = @WidgetID

Both of these "optimizations" resulted in significant speedups, each one shaving nearly a full minute off the execution time, so that the original 2-minute deletion now takes about 5-10 seconds - at least for new widgets, without much history or test data.

Just to be absolutely clear, there is still a CASCADE from WidgetHistory to WidgetHistoryDetails, where the fanout is highest, I only removed the one originating from Widgets.

Further "flattening" of the cascade relationships resulted in progressively less dramatic but still noticeable speedups, to the point where deleting a new widget was almost instantaneous once all of the cascade deletes to larger tables were removed and replaced with explicit deletes.

I'm using DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE before each test. I've disabled all triggers that might be causing further slowdowns (although those would show up in the execution plan anyway). And I'm testing against older widgets, too, and noticing a significant speedup there as well; deletes that used to take 5 minutes now take 20-40 seconds.

Now I'm an ardent supporter of the "SELECT ain't broken" philosophy, but there just doesn't seem to be any logical explanation for this behaviour other than crushing, mind-boggling inefficiency of the CASCADE DELETE relationships.

So, my questions are:

  • Is this a known issue with DRI in SQL Server? (I couldn't seem to find any references to this sort of thing on Google or here in SO; I suspect the answer is no.)

  • If not, is there another explanation for the behaviour I'm seeing?

  • If it is a known issue, why is it an issue, and are there better workarounds I could be using?

+4  A: 

SQL Server is best at set-based operations, while CASCADE deletions are, by their nature, record-based.

SQL Server, unlike the other servers, tries to optimize the immediate set-based operations, however, it works only one level deep. It needs to have the records deleted in the upper-level tables to delete those in the lower-level tables.

In other words, cascading operations work up-down, while your solution works down-up, which is more set-based and efficient.

Here's a sample schema:

CREATE TABLE t_g (id INT NOT NULL PRIMARY KEY)

CREATE TABLE t_p (id INT NOT NULL PRIMARY KEY, g INT NOT NULL, CONSTRAINT fk_p_g FOREIGN KEY (g) REFERENCES t_g ON DELETE CASCADE)

CREATE TABLE t_c (id INT NOT NULL PRIMARY KEY, p INT NOT NULL, CONSTRAINT fk_c_p FOREIGN KEY (p) REFERENCES t_p ON DELETE CASCADE)

CREATE INDEX ix_p_g ON t_p (g)

CREATE INDEX ix_c_p ON t_c (p)

, this query:

DELETE
FROM    t_g
WHERE   id > 50000

and its plan:

  |--Sequence
       |--Table Spool
       |    |--Clustered Index Delete(OBJECT:([test].[dbo].[t_g].[PK__t_g__176E4C6B]), WHERE:([test].[dbo].[t_g].[id] > (50000)))
       |--Index Delete(OBJECT:([test].[dbo].[t_p].[ix_p_g]) WITH ORDERED PREFETCH)
       |    |--Sort(ORDER BY:([test].[dbo].[t_p].[g] ASC, [test].[dbo].[t_p].[id] ASC))
       |         |--Table Spool
       |              |--Clustered Index Delete(OBJECT:([test].[dbo].[t_p].[PK__t_p__195694DD]) WITH ORDERED PREFETCH)
       |                   |--Sort(ORDER BY:([test].[dbo].[t_p].[id] ASC))
       |                        |--Merge Join(Inner Join, MERGE:([test].[dbo].[t_g].[id])=([test].[dbo].[t_p].[g]), RESIDUAL:([test].[dbo].[t_p].[g]=[test].[dbo].[t_g].[id]))
       |                             |--Table Spool
       |                             |--Index Scan(OBJECT:([test].[dbo].[t_p].[ix_p_g]), ORDERED FORWARD)
       |--Index Delete(OBJECT:([test].[dbo].[t_c].[ix_c_p]) WITH ORDERED PREFETCH)
            |--Sort(ORDER BY:([test].[dbo].[t_c].[p] ASC, [test].[dbo].[t_c].[id] ASC))
                 |--Clustered Index Delete(OBJECT:([test].[dbo].[t_c].[PK__t_c__1C330188]) WITH ORDERED PREFETCH)
                      |--Table Spool
                           |--Sort(ORDER BY:([test].[dbo].[t_c].[id] ASC))
                                |--Hash Match(Inner Join, HASH:([test].[dbo].[t_p].[id])=([test].[dbo].[t_c].[p]))
                                     |--Table Spool
                                     |--Index Scan(OBJECT:([test].[dbo].[t_c].[ix_c_p]), ORDERED FORWARD)

First, SQL Server deletes records from t_g, then joins the records deleted with t_p and deletes from the latter, finally, joins records deleted from t_p with t_c and deletes from t_c.

A single three-table join would be much more efficient in this case, and this is what you do with your workaround.

If it makes you feel better, Oracle does not optimize cascade operations in any way: they are always NESTED LOOPS and God help you if your forgot to create an index on the referencing column.

Quassnoi
Interesting... does that imply that if I have more nesting levels, say 4 or 5, each with 100-1000 row fanouts, that it's only "safe" (performance-wise) to have `CASCADE` relationships one level above the leaf level?
Aaronaught
@Aaronaught: right. Spooling requires extra work and the index statistics are not always passed correctly across the queries.
Quassnoi
So if I've done my math right, a `CASCADE DELETE` is fundamentally an O(N^X-1) operation, with *N* being the average fanout and *X* being the nesting level. Puts a whole new spin on database design... have to start thinking about the nesting level whenever I'm about to add another `CASCADE`. Is this documented anywhere? I'm just wondering if it's something I should have known, or if it's one of those "hidden features."
Aaronaught
@Aaronaught: It's more about optimizer. You have three tables: `A -> B -> C`. With plain `DELETE FROM C JOIN B JOIN A`, optimizer considers all possible join methods with all three tables in any order. With cascade deletes, it's two queries: `A JOIN B` and `B JOIN C`, in this order, unlinked, and the results need to be cached in between. This is a restriction which, like in your case, robs the optimizer of the possibility to choose the really optimal way.
Quassnoi
Personally we simply don't use cascading deletes ever. What if you tried to delete a record that had thousands of child records in 30 related tables? Two minutes is fast compared to what that would do. And you'd probably end up with a lot of users being blocked while it happened as well. And many times the existance of the child record is exactly what you need to know about to stop you from deleting that record.
HLGEM
@HLGEM: but what if you *need* to delete that record and all of its children?
Quassnoi
Precisely. I was clear in the question about it being a last resort - such deletions are extremely rare, but they happen, and when they do it's usually urgent (and yes, it does block other users). Looks like this answer is pretty much indisputable; I'd ask why SQL can't/doesn't optimize this but I suppose that's beyond the scope of this question.
Aaronaught
Then you write delete staments to do it and possibly to do the chilren in batches for performance.
HLGEM