This sounds like clustered key is added to (all) indermediate nodes of non-unique non-clustered index. And by the same logic RIDs are added to indermediate nodes in case of non-clustered table (?)
Yes, this is true.
This is done to improve the maintainability of an index.
Say, you have a secondary (non-clustered) index on column
, 1,000,000
records with column = 1
and want to delete one of these records.
The record needs to be deleted from the index as well.
To locate the record to be deleted, a B-Tree
search should be performed on the index. But if the branch nodes were not storing the value of the row pointer (be it a clustered key or a RID
), the engine would have to scan all 1M
records to determine the one to delete.
If the secondary key were UNIQUE
, the value of the column
would be enough to uniquely locate the node in the index, so storing the row pointer in the branch nodes is not required (and that's why they are not stored).
This discussion may be also interesting to you:
http://www.sqlservercentral.com/Forums/Topic714684-1545-6.aspx
Update:
To check the contents of the branch nodes, you can use DBCC IND
:
CREATE TABLE t_clustered (id INT NOT NULL PRIMARY KEY, nval INT, uval INT)
CREATE TABLE t_nonclustered (id INT NOT NULL PRIMARY KEY NONCLUSTERED, nval INT, uval INT)
CREATE INDEX ix_clustered_nval ON t_clustered (nval)
CREATE UNIQUE INDEX ux_clustered_uval ON t_clustered (uval)
CREATE INDEX ix_nonclustered_nval ON t_nonclustered (nval)
CREATE UNIQUE INDEX ux_nonclustered_nval ON t_nonclustered (uval)
;
WITH q(id) AS
(
SELECT 1
UNION ALL
SELECT id + 1
FROM q
WHERE id < 10000
)
INSERT
INTO t_clustered
SELECT id, (id - 1) / 10 + 1, id
FROM q
OPTION (MAXRECURSION 0)
;
WITH q(id) AS
(
SELECT 1
UNION ALL
SELECT id + 1
FROM q
WHERE id < 10000
)
INSERT
INTO t_nonclustered
SELECT id, (id - 1) / 10 + 1, id
FROM q
OPTION (MAXRECURSION 0)
-- Replace mydb with your database name
DBCC IND (mydb, t_clustered, -1)
DBCC IND (mydb, t_nonclustered, -1)
In the output of these commands you should search for records with PageType = 2
(index page) and IndexLevel > 0
(non-leaf node) and find their PageID
.
In my case, I got the following PageID
: 21074, 21076, 21105, 21107. Note they are site specific: you will have the other values.
Then you should used DBCC PAGE
to examine the contents of these pages:
DBCC PAGE (mydb, 1, 21074, 3)
DBCC PAGE (mydb, 1, 21076, 3)
DBCC PAGE (mydb, 1, 21105, 3)
DBCC PAGE (mydb, 1, 21107, 3)
FileId PageId Row Level ChildFileId ChildPageId nval (key) id (key) KeyHashValue
FileId PageId Row Level ChildFileId ChildPageId uval (key) KeyHashValue
FileId PageId Row Level ChildFileId ChildPageId nval (key) HEAP RID (key) KeyHashValue
FileId PageId Row Level ChildFileId ChildPageId uval (key) KeyHashValue
We see that the non-leaf nodes of the non-unique secondary index on nval
contain record pointers (id (PRIMARY KEY CLUSTERED)
and RID
, appropriately), while those of the unique index on uval
do not contain record pointers, only the values on the indexed column itself.
This is, again, because with a unique index the value of the column indexed is sufficient to locate its node in the index, while with a non-unique index it's not.