views:

298

answers:

2

This has been bugging me for a while and I'm hoping that one of the SQL Server experts can shed some light on it.

The question is:

When you index a SQL Server column containing a UDT (CLR type), how does SQL Server determine what index operation to perform for a given query?

Specifically I am thinking of the hierarchyid (AKA SqlHierarchyID) type. The way Microsoft recommends that you use it - and the way I do use it - is:

  • Create an index on the hierarchyid column itself (let's call it ID). This enables a depth-first search, so that when you write WHERE ID.IsDescendantOf(@ParentID) = 1, it can perform an index seek.

  • Create a persisted computed Level column and create an index on (Level, ID). This enables a breadth-first search, so that when you write WHERE ID.GetAncestor(1) = @ParentID, it can perform an index seek (on the second index) for this expression.

But what I don't understand is how is this possible? It seems to violate the normal query plan rules - the calls to GetAncestor and IsDescendantOf don't appear to be sargable, so this should result in a full index scan, but it doesn't. Not that I am complaining, obviously, but I am trying to understand if it's possible to replicate this functionality on my own UDTs.

Is hierarchyid simply a "magical" type that SQL Server has a special awareness of, and automatically alters the execution plan if it finds a certain combination of query elements and indexes? Or does the SqlHierarchyID CLR type simply define special attributes/methods (similar to the way IsDeterministic works for persisted computed columns) that are understood by the SQL Server engine?

I can't seem to find any information about this. All I've been able to locate is a paragraph stating that the IsByteOrdered property makes things like indexes and check constraints possible by guaranteeing one unique representation per instance; while this is somewhat interesting, it doesn't explain how SQL Server is able to perform a seek with certain instance methods.

So the question again - how do the index operations work for types like hierarchyid, and is it possible to get the same behaviour in a new UDT?

+2  A: 

The query optimizer team is trying to handle scenarios that don't change the order of things. For example, cast(someDateTime as date) is still sargable. I'm hoping that as time continues, they fix up a bunch of old ones, such as dateadd/datediff with a constant.

So... handling Ancestor is effectively like using the LIKE operator with the start of a string. It doesn't change the order, and you can still get away with stuff.

Rob Farley
Did `LEFT` and `CAST` become sargable in SQL Server 2008? I only have 2005 on my home machine to test with and they definitely force an index scan here. I'll check this out at work tomorrow, this will be interesting if true.
Aaronaught
CAST is sargable in some situations, even in SQL 2005. LEFT still isn't, but LIKE is when the wildcard is at the end, which is essentially providing LEFT functionality.
Rob Farley
I'll edit the answer, to clarify what I mean about 'Left'
Rob Farley
So I tested that with `CAST` today on a `smallint` to `tinyint` and it did indeed result in an index seek. I don't think I've written any queries like that, but it's interesting to know. So you are essentially saying that `hierarchyid` **is** a special type then, that the engine specifically knows how to sarg invocations of `GetAncestor` and `IsDescendantOf`?
Aaronaught
Yup. That's what I'm saying.
Rob Farley
Looks like the evidence is in your favour. Too bad for me, as that implies that there's no way to give hints to the optimizer with respect to how to sarg my own UDT's method or property. Le sigh, maybe in some distant future version.
Aaronaught
Yeah, not something you can set up yourself. Also, I wrote a blogpost at http://msmvps.com/blogs/robfarley/archive/2010/01/22/sargable-functions-in-sql-server.aspx
Rob Farley
A: 

You are correct - HierarchyId and Geometry/Geography are both "magical" types that the Query Optimizer is able to recognize and rewrite the plans for in order to produce optimized queries - it's not as simple as just recognizing sargable operators. There is no way to simulate equivalent behavior with other UDTs.

For HierarchyId, the binary serialization of the type is special in order to represent the hierarchical structure in a binary ordered fashion. It is similar to the mechanism used by the SQL Xml type and described in a research paper ORDPATHs: Insert-Friendly XML Node Labels. So while the QO rules to translate queries that use IsDescendant and GetAncestor are special, the actual underlying index is a regular relational index on the binary hierarchyid data and you could achieve the same behavior if you were willing to write your original queries to do range seeks instead of calling the simple method.

stevehem