ansaurus

Question

Oracle: does the column order matter in an index?

Answer 1

+3 A:

1. If a column is used alone in a query and it is not the leading column in the index, Oracle may fall back to a less efficient index skip scan access path.
2. It depends on the respective selectivity of the columns.
3. No.
4. I would look at the queries involving the columns in the index and rank their position in the index by how heavily each is queried.

andy.larsen 2010-02-04 00:26:03

Answer 2

+3 A:

You may find answers to some of your questions here: Index Skip Scan – Does Index Column Order Matter Any More? (Warning Sign)

EddieAwad 2010-02-04 00:27:31

Answer 3

+3 A:

1. If a and b both have 1000 distinct values and they are always queried together, then the order of columns in the index doesn't really matter. But if a has only 10 distinct values, or you have queries which use just one of the columns, then it does matter; in these scenarios the index may not be used if the column ordering does not suit the query.
2. The column with the fewest distinct values ought to be first and the column with the most distinct values last. This not only maximises the utility of the index, it also increases the potential gains from index compression.
3. The datatype and length of the column have an impact on the return we can get from index compression, but not on the best order of columns in an index.
4. Arrange the columns with the least selective column first and the most selective column last. In the case of a tie, lead with the column which is more likely to be used on its own.
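To make the point about single-column queries concrete, here is a toy model in Python, not Oracle's actual implementation. It assumes a composite index can be pictured as a sorted list of tuples with integer keys; the function names `range_seek` and `skip_scan` are made up for this illustration. A predicate on the leading column needs one seek, while a predicate on the trailing column needs one probe per distinct leading value, which is why a leading column with few distinct values suits a skip scan better.

```python
from bisect import bisect_left, bisect_right

def range_seek(index, lead):
    """Predicate on the leading column: one binary search finds the contiguous run."""
    lo = bisect_left(index, (lead,))
    hi = bisect_left(index, (lead + 1,))  # assumes integer keys
    return index[lo:hi]

def skip_scan(index, trail):
    """Predicate on the trailing column: probe once per distinct leading value."""
    hits, probes, pos = [], 0, 0
    while pos < len(index):
        lead = index[pos][0]
        probes += 1
        lo = bisect_left(index, (lead, trail))
        hi = bisect_right(index, (lead, trail))
        hits.extend(index[lo:hi])
        pos = bisect_left(index, (lead + 1,))  # jump to the next leading value
    return hits, probes

# Hypothetical data: a has 10 distinct values, b has 1000.
index_ab = sorted((a, b) for a in range(10) for b in range(1000))
index_ba = sorted((b, a) for a in range(10) for b in range(1000))

_, probes_ab = skip_scan(index_ab, 42)  # query on b alone, index (a, b)
_, probes_ba = skip_scan(index_ba, 3)   # query on a alone, index (b, a)
print(probes_ab, probes_ba)
```

In this model the skip scan over (a, b) makes 10 probes, whereas the skip scan over (b, a) makes 1000. Whether the real optimizer chooses a skip scan at all depends on statistics and costing, so treat this only as an intuition aid.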

The one potential exception to points 2 and 3 is with DATE columns. Because Oracle DATE columns include a time element, they might have 86400 distinct values per day. However, most queries on a DATE column are only interested in the day element, so you might want to consider only the number of distinct days in your calculations. That said, I suspect it will affect the relative selectivity in only a handful of cases.
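The DATE arithmetic can be sketched as follows. This is plain Python with made-up data (one hypothetical row per second over two days), not a query against an Oracle table; it only shows how far apart the two distinct-value counts can be.

```python
from datetime import datetime, timedelta

start = datetime(2010, 2, 4)

# Hypothetical data: one row per second across two full days.
timestamps = [start + timedelta(seconds=s) for s in range(2 * 86400)]

# Distinct values when the time element is included.
distinct_timestamps = len(set(timestamps))

# Distinct values when only the day element is considered.
distinct_days = len({ts.date() for ts in timestamps})

print(distinct_timestamps)  # 172800
print(distinct_days)        # 2
```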

edit (in response to Nick Pierpoint's comment)

The two main reasons for leading with the least selective column are

Index compression
Index Skip reads

Both of these work their magic from knowing that the value in the current slot is the same as the value in the previous slot. Consequently we can maximise the return from these techniques by minimising the number of times the value changes. In the following example, A has four distinct values and B has six. The dittos represent a compressible value or a skippable index block.

Least selective column leads ...

A          B
---------  -
AARDVARK   1
"          2
"          3
"          4
"          5
"          6
DIFFVAL    1
"          2
"          3
"          4
"          5
"          6
OTHERVAL   1
"          2
"          3
"          4
"          5
"          6
WHATEVER   1
"          2
"          3
"          4
"          5
"          6

Most selective column leads ...

B  A
-  --------
1  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER
2  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER
3  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER
4  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER
5  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER
6  AARDVARK
"  DIFFVAL
"  OTHERVAL
"  WHATEVER

Even in this trivial example, (A, B) has 20 skippable slots compared to the 18 of (B, A). A wider disparity would generate greater ROI on index compression or better utility from Index Skip reads.
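The two counts can be checked with a short Python sketch. It assumes, as in the listings above, that every combination of A and B occurs exactly once, and it counts a slot as skippable when its leading value repeats the previous row's leading value. The helper name `count_repeated_leading` is made up for this illustration.

```python
from itertools import product

def count_repeated_leading(rows):
    """Count rows whose leading column equals that of the previous row in index order."""
    ordered = sorted(rows)
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if prev[0] == cur[0])

a_values = ["AARDVARK", "DIFFVAL", "OTHERVAL", "WHATEVER"]
b_values = [1, 2, 3, 4, 5, 6]

ab_entries = list(product(a_values, b_values))  # index on (A, B)
ba_entries = [(b, a) for a, b in ab_entries]    # index on (B, A)

print(count_repeated_leading(ab_entries))  # 20
print(count_repeated_leading(ba_entries))  # 18
```

With n rows and d distinct leading values, the count is n - d, so the gap widens as the difference in distinct values between the two columns grows.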

As is the case with most tuning heuristics, we need to benchmark using actual values and realistic volumes. This is definitely a scenario where data skew could have a dramatic impact on the effectiveness of different approaches.

APC 2010-02-04 07:44:59

Hi. I'm not clear on your point 4. Can you explain? In general I'd put the *most* selective column first. I'd only put the least selective column first when I thought a histogram might usefully lead the CBO to skip the index entirely.

Nick Pierpoint 2010-02-04 13:18:24

Thanks for the additional edit APC - it made your point clearly. Your comment about the need to benchmark is well made. I think if you have a highly selective column then - from a performance perspective - you'll do well to put it first. Benchmark... benchmark... benchmark...

Nick Pierpoint 2010-02-04 22:49:46
