tags:

views:

185

answers:

2

I want to create an index on a table and I'm trying to decide the order of the columns in the index. Normally, I go by gut feeling and my knowledge of the data and of the queries the application will typically generate, but in this case looking at the data and the queries I think I need to create and index based on column selectivity. So that the most selective column (the one that narrows further choices down the most) will be the first in the index, followed by the second most selective column etc.

My problem is calculating selectivity of the column i.e. math. I would post it on mathoverflow but I don't think they necessarily know what indexes and columns are. Please help.

+2  A: 

It's a simple ratio for each column:

(Number of Unique Values for the Column) to (Number of Rows In Table)

Calculating the numbers depends on your RDBMS. For SQL Server, you could get the numbers as follows:

SELECT COUNT(DISTINCT ColumnName) FROM TableName

SELECT COUNT(*) FROM TableName
Randolpho
Got it thanks. having played with the numbers a bit, one thing to add: when assessing would-be performance of multi-column indexes, need to calculate selectivity of various _combinations_ of columns as well.
zvolkov
+2  A: 

How to get the selectivity of an index

G Mastros