Here's a brain-twister for the SQL guys - can anyone think of a reason why the first of these functions performs fine, and the second one runs dog-slow?
Function A - Typically finishes in ~5 ms
CREATE FUNCTION dbo.GoodFunction
(
@IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
SELECT p.ID, p.Node, p.Name, p.Level
FROM
(
SELECT DISTINCT a.Ancestor AS Node
FROM Hierarchy h
CROSS APPLY dbo.GetAncestors(h.Node.GetAncestor(1)) a
WHERE h.ID IN (SELECT Value FROM @IDs)
) np
INNER JOIN Hierarchy p
ON p.Node = np.Node
Function B - Runs extremely slow - I gave up after 5 minutes
CREATE FUNCTION dbo.BadFunction
(
@IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
WITH Ancestors_CTE AS
(
SELECT DISTINCT a.Ancestor AS Node
FROM Hierarchy c
CROSS APPLY dbo.GetAncestors(c.Node.GetAncestor(1)) a
WHERE c.ID IN (SELECT Value FROM @IDs)
)
SELECT p.ID, p.Node, p.Name, p.Level
FROM Ancestors_CTE ac
INNER JOIN Hierarchy p
ON p.Node = ac.Node
I'll explain below what this function does, but before I get into that, I want to point out that I don't think it's important, because as far as I can tell, these two functions are exactly the same! The only difference is that one uses a CTE and one uses a subquery; the contents of the subquery in A and the CTE in B are identical.
In case anyone decides this matters: The purpose of this function is just to pick out all the possible ancestors (parent, grandparent, etc.) of an arbitrary number of locations in a hierarchy. The Node
column is a hierarchyid
, and dbo.GetAncestors
is a CLR function that simply walks up the path, it does not do any data access.
UniqueIntTable
is what it implies - it's a user-defined table type with one column, Value int NOT NULL PRIMARY KEY
. Everything here that should be indexed is indexed - the execution plan of function A is essentially just two index seeks and a hash match, as it should be with function B.
Some even stranger aspects to this strange problem:
I'm not even able to get an estimated execution plan for a simple query using function B. It almost looks like the performance issue has something to do with the compilation of this simple-looking function.
If I take the "body" out of function B and just stick it into an inline query, it runs normally, same performance as function A. So it only seems to be a problem with a CTE inside a UDF, or conversely, only with a UDF that uses a CTE.
The CPU usage on one core on the test machine spikes all the way up to 100% when I try to run B. There doesn't seem to be much I/O.
I want to just shrug it off as a SQL Server bug and use version A, but I always try to keep Rule #1 ("SELECT Ain't Broken") in mind, and I'm concerned that the good results from function A are somehow a localized fluke, that it will "fail" the same way that B does on a different server.
Any ideas?
UPDATE - I'm now including a complete self-contained script to reproduce.
GetAncestors Function
[SqlFunction(FillRowMethodName = "FillAncestor",
TableDefinition = "Ancestor hierarchyid", IsDeterministic = true,
IsPrecise = true, DataAccess = DataAccessKind.None)]
public static IEnumerable GetAncestors(SqlHierarchyId h)
{
while (!h.IsNull)
{
yield return h;
h = h.GetAncestor(1);
}
}
Schema Creation
BEGIN TRAN
CREATE TABLE Hierarchy
(
ID int NOT NULL IDENTITY(1, 1)
CONSTRAINT PK_Hierarchy PRIMARY KEY CLUSTERED,
Node hierarchyid NOT NULL,
[Level] as Node.GetLevel(),
Name varchar(50) NOT NULL
)
CREATE INDEX IX_Hierarchy_Node
ON Hierarchy (Node)
INCLUDE (Name)
CREATE INDEX IX_Hierarchy_NodeBF
ON Hierarchy ([Level], Node)
GO
INSERT Hierarchy (Node, Name)
SELECT CAST('/1/' AS hierarchyid), 'Alice' UNION ALL
SELECT CAST('/1/1/' AS hierarchyid), 'Bob' UNION ALL
SELECT CAST('/1/1/1/' AS hierarchyid), 'Charles' UNION ALL
SELECT CAST('/1/1/2/' AS hierarchyid), 'Dave' UNION ALL
SELECT CAST('/1/1/3/' AS hierarchyid), 'Ellen' UNION ALL
SELECT CAST('/1/2/' AS hierarchyid), 'Fred' UNION ALL
SELECT CAST('/1/3/' AS hierarchyid), 'Graham' UNION ALL
SELECT CAST('/1/3/1/' AS hierarchyid), 'Harold' UNION ALL
SELECT CAST('/1/3/2/' AS hierarchyid), 'Isabelle' UNION ALL
SELECT CAST('/1/4/' AS hierarchyid), 'John' UNION ALL
SELECT CAST('/2/' AS hierarchyid), 'Karen' UNION ALL
SELECT CAST('/2/1/' AS hierarchyid), 'Liam' UNION ALL
SELECT CAST('/2/2/' AS hierarchyid), 'Mary' UNION ALL
SELECT CAST('/2/2/1/' AS hierarchyid), 'Nigel' UNION ALL
SELECT CAST('/2/2/2/' AS hierarchyid), 'Oliver' UNION ALL
SELECT CAST('/2/3/' AS hierarchyid), 'Peter' UNION ALL
SELECT CAST('/2/3/1/' AS hierarchyid), 'Quinn'
GO
CREATE TYPE UniqueIntTable AS TABLE
(
Value int NOT NULL,
PRIMARY KEY (Value)
)
GO
COMMIT
GO
The above code/script can be used to create the CLR function/DB schema; use the same GoodFunction
and BadFunction
scripts in the original.