Long story short: if your data are intrinsically UNIQUE
, you will benefit from creating a UNIQIE
index on them.
See the article in my blog for detailed explanation:
Now, the gory details.
As @Mehrdad said, UNIQUENESS
affects the estimated row count in the plan builder.
UNIQUE
index has maximal possible selectivity, that's why:
SELECT *
FROM table1 t2, table2 t2
WHERE t1.id = :myid
AND t2.unique_indexed_field = t1.value
almost surely will use NESTED LOOPS
, while
SELECT *
FROM table1 t2, table2 t2
WHERE t1.id = :myid
AND t2.non_unique_indexed_field = t1.value
may benefit from a HASH JOIN
if the optimizer thinks that non_unique_indexed_field
is not selective.
If your index is CLUSTERED
(i. e. the rows theirselves are contained in the index leaves) and non-UNIQUE
, then a special hidden column called uniquifier
is added to each index key, thus making the key larger and the index slower.
That's why UNIQUE CLUSTERED
index is in fact a little more efficicent than a non-UNIQUE CLUSTERED
one.
In Oracle
, a join on UNIQUE INDEX
is required for a such called key preservation
, which ensures that each row from a table will be selected at most once and makes a view updatable.
This query:
UPDATE (
SELECT *
FROM mytable t1, mytable t2
WHERE t2.reference = t1.unique_indexed_field
)
SET value = other_value
will work in Oracle
, while this one:
UPDATE (
SELECT *
FROM mytable t1, mytable t2
WHERE t2.reference = t1.non_unique_indexed_field
)
SET value = other_value
will fail.
This is not an issue with SQL Server
, though.
One more thing: for a table like this,
CREATE TABLE t_indexer (id INT NOT NULL PRIMARY KEY, uval INT NOT NULL, ival INT NOT NULL)
CREATE UNIQUE INDEX ux_indexer_ux ON t_indexer (uval)
CREATE INDEX ix_indexer_ux ON t_indexer (ival)
, this query:
/* Sorts on the non-unique index first */
SELECT TOP 1 *
FROM t_indexer
ORDER BY
ival, uval
will use a TOP N SORT
, while this one:
/* Sorts on the unique index first */
SELECT TOP 1 *
FROM t_indexer
ORDER BY
uval, ival
will use just an index scan.
For the latter query, there is no point in additional sorting on ival
, since uval
are unique anyway, and the optimizer takes this into account.
On sample data of 200,000
rows (id == uval == ival
), the former query runs for 15
seconds, while the latter one is instant.