A non-clustered index on (a, b) is a "copy" of a part of the table whose rows are sorted on a, then on b, and which contain a reference to the original row.
It helps to run queries like this:
SELECT *
FROM mytable
WHERE a = @A
AND b = @B
or this:
SELECT *
FROM mytable
ORDER BY
a, b
or this:
SELECT *
FROM mytable
WHERE a = @A
ORDER BY
b
and many others.
For instance, we have a table like this:
# col1 col2 col3
1 1 1 1
2 1 4 8
3 7 2 3
4 3 3 9
5 8 9 4
6 2 2 7
7 5 3 5
8 3 9 4
If we create an index on (col2, col3), it will contain the following data:
col2 col3 #
1 1 1
2 3 3
2 7 6
3 5 7
3 9 4
4 8 2
9 4 5
9 4 8
i.e. the entries are sorted first on col2, then on col3, then on the reference to the row.
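As a rough illustration, the index contents can be reproduced by sorting (col2, col3, #) tuples. This is only a sketch in Python: the names rows and build_index are made up for the example, and a real engine stores these entries in a B-tree rather than a flat list.

```python
# Rows of the example table as (row_number, col1, col2, col3).
rows = [
    (1, 1, 1, 1), (2, 1, 4, 8), (3, 7, 2, 3), (4, 3, 3, 9),
    (5, 8, 9, 4), (6, 2, 2, 7), (7, 5, 3, 5), (8, 3, 9, 4),
]


def build_index(rows, key_positions):
    """Return index entries (key columns..., row reference) in sorted order.

    Hypothetical helper: the row number "#" plays the role of the reference.
    """
    entries = [tuple(r[p] for p in key_positions) + (r[0],) for r in rows]
    # Tuples compare left to right, so this sorts on the first key column,
    # then on the second, then on the row reference.
    return sorted(entries)


index_col2_col3 = build_index(rows, key_positions=(2, 3))
for entry in index_col2_col3:
    print(*entry)
```

The printed entries come out in the same order as the listing above.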
It's easy to see that this index works as an index on col2 alone just as well, since sorting on (col2, col3) implies sorting on col2.
Order matters, so if we create an index on (col3, col2), the rows will be sorted differently:
col2 col3 #
1 1 1
2 3 3
9 4 5
9 4 8
3 5 7
2 7 6
4 8 2
3 9 4
This index is an index on col3 too.
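A quick way to convince yourself of the column-order point is to check which columns come out sorted in each index. The sketch below is plain Python over the same sample data; is_sorted and the variable names are invented for the example.

```python
# Rows of the example table as (row_number, col1, col2, col3).
rows = [
    (1, 1, 1, 1), (2, 1, 4, 8), (3, 7, 2, 3), (4, 3, 3, 9),
    (5, 8, 9, 4), (6, 2, 2, 7), (7, 5, 3, 5), (8, 3, 9, 4),
]

# Index entries are (leading key, trailing key, row reference).
idx_col2_col3 = sorted((c2, c3, n) for n, _c1, c2, c3 in rows)
idx_col3_col2 = sorted((c3, c2, n) for n, _c1, c2, c3 in rows)


def is_sorted(values):
    """True if the values never decrease from one entry to the next."""
    return all(a <= b for a, b in zip(values, values[1:]))


# In each index only the leading column is guaranteed to be in order.
col2_in_first = is_sorted([e[0] for e in idx_col2_col3])
col3_in_first = is_sorted([e[1] for e in idx_col2_col3])
col3_in_second = is_sorted([e[0] for e in idx_col3_col2])
col2_in_second = is_sorted([e[1] for e in idx_col3_col2])

print(col2_in_first, col3_in_first)    # leading col2, trailing col3
print(col3_in_second, col2_in_second)  # leading col3, trailing col2
```

The trailing column is sorted only within groups that share the same leading value, which is why each index can stand in for a single-column index on its leading column only.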
If we want to find the rows within a certain range of (col2, col3), we just take a slice from the ordered data:
SELECT col2, col3
FROM mytable
WHERE col2 BETWEEN 2 AND 3
col2 col3 #
1 1 1
----
2 3 3
2 7 6
3 5 7
3 9 4
----
4 8 2
9 4 5
9 4 8
It's easy to see that we cannot take such a slice on col3 using this index, since col3 is not ordered by itself.
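The slice can be sketched with a binary search over the leading column. This is a simplified Python model, not how any particular engine implements a range scan; range_scan_col2 and scan_col3 are hypothetical names.

```python
from bisect import bisect_left, bisect_right

# Rows of the example table as (row_number, col1, col2, col3).
rows = [
    (1, 1, 1, 1), (2, 1, 4, 8), (3, 7, 2, 3), (4, 3, 3, 9),
    (5, 8, 9, 4), (6, 2, 2, 7), (7, 5, 3, 5), (8, 3, 9, 4),
]

# Index on (col2, col3): entries are (col2, col3, row reference).
index = sorted((c2, c3, n) for n, _c1, c2, c3 in rows)
leading = [entry[0] for entry in index]  # col2 values, already in order


def range_scan_col2(low, high):
    """Emulate WHERE col2 BETWEEN low AND high with two binary searches."""
    start = bisect_left(leading, low)
    stop = bisect_right(leading, high)
    return index[start:stop]  # one contiguous slice


def scan_col3(low, high):
    """A col3 range has no contiguous slice here, so every entry is checked."""
    return [entry for entry in index if low <= entry[1] <= high]


print(range_scan_col2(2, 3))
print(scan_col3(4, 5))
```

The first call returns the four entries between the dashed lines above. The second has to look at all eight entries, because the matching col3 values are scattered through the index.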
The "reference" mentioned above is the RID of the row (a pointer to its place in the tablespace) if the table itself is non-clustered, or the value of the table's cluster key if the table is clustered.
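To show what the reference is used for, here is a small Python sketch of a SELECT * lookup through the index on a non-clustered table. It assumes the row number # stands in for the RID, and heap and select_star are names made up for the example.

```python
from bisect import bisect_left

# Rows of the example table as (row_number, col1, col2, col3).
rows = [
    (1, 1, 1, 1), (2, 1, 4, 8), (3, 7, 2, 3), (4, 3, 3, 9),
    (5, 8, 9, 4), (6, 2, 2, 7), (7, 5, 3, 5), (8, 3, 9, 4),
]

# Assumption for illustration: the row number acts as the RID of a heap row.
heap = {r[0]: r for r in rows}

# Index on (col2, col3): entries are (col2, col3, reference).
index = sorted((r[2], r[3], r[0]) for r in rows)


def select_star(col2, col3):
    """Emulate SELECT * FROM mytable WHERE col2 = ? AND col3 = ?."""
    # A 2-tuple sorts just before any 3-tuple sharing the same prefix,
    # so this finds the first matching entry.
    start = bisect_left(index, (col2, col3))
    result = []
    for key2, key3, ref in index[start:]:
        if (key2, key3) != (col2, col3):
            break  # past the matching entries
        result.append(heap[ref])  # follow the reference to the full row
    return result


print(select_star(9, 4))
```

With a clustered table the dictionary key would be the cluster key value instead of a physical row locator, but the extra hop from index entry to table row looks the same.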
A clustered index does not create a shadow copy of the values. Instead, it rearranges the table's rows themselves.
If you create a clustered index on (col2, col3) for the table above, it will just rearrange the table rows:
# col1 col2 col3
1 1 1 1
3 7 2 3
6 2 2 7
7 5 3 5
4 3 3 9
2 1 4 8
5 8 9 4
8 3 9 4
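In the same toy model, clustering amounts to sorting the rows themselves rather than building a separate list of entries. A sketch in Python, with clustered as an illustrative name:

```python
# Rows of the example table as (row_number, col1, col2, col3).
rows = [
    (1, 1, 1, 1), (2, 1, 4, 8), (3, 7, 2, 3), (4, 3, 3, 9),
    (5, 8, 9, 4), (6, 2, 2, 7), (7, 5, 3, 5), (8, 3, 9, 4),
]

# No copy of the key columns is made: the full rows are put in key order.
# sorted() is stable, so rows with equal keys keep their original order.
clustered = sorted(rows, key=lambda r: (r[2], r[3]))

for row in clustered:
    print(*row)
```

The output matches the rearranged table above, with all columns of each row travelling together.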
Whether a table is clustered or not is, therefore, a matter of how the table is stored rather than an index in its own right.
In Oracle, a clustered table is called an index-organized table (rows are sorted), as opposed to a heap-organized table (rows are not sorted).