views:

29

answers:

1

I have some questions on Composite Primary Keys and the cardinality of the columns. I searched the web, but did not find any definitive answer, so I am trying again. The questions are:

Context: Large (50M - 500M rows) OLAP Prep tables, not NOSQL, not Columnar. MySQL and DB2

1) Does the order of keys in a PK matter?

2) If the cardinality of the columns varies heavily, which should be used first. For example, if I have CLIENT/CAMPAIGN/PROGRAM where CLIENT is highly cardinal, CAMPAIGN is moderate, PROGRAM is almost like a bitmap index, what order is the best?

3) What order is the best for Join, if there is a Where clause and when there is no Where Clause (for views)

Thanks in advance.

+1  A: 

1) Does the order of keys in a PK matter?

Yes, it changes the order of the record for the index that is used to police the PRIMARY KEY.

2) If the cardinality of the columns varies heavily, which should be used first. For example, if I have CLIENT/CAMPAIGN/PROGRAM where CLIENT is highly cardinal, CAMPAIGN is moderate, PROGRAM is almost like a bitmap index, what order is the best?

For select queries, this totally depends on the queries you are going to use. If you are searching for all three columns at once, the order is not important; if you are searching for two or one columns, they should be leading in the index.

For inserts, it is better to make the leading column match the order in which the records are inserted.

3) What order is the best for Join, if there is a Where clause and when there is no Where Clause (for views)

Again, this depends on the WHERE clause.

Quassnoi
Thanks, so if I join on CLIENT and CAMPAIGN only, should I be joining on CAMPAIGN (low Cardinality) first then CLIENT (higher Cardinality)
srini.venigalla
@srini: there is no "first" and "last" in a join, you always join on both columns at the same time. These columns should be leading in the index `(client, campaign, program)` for the join to be efficient.
Quassnoi
Sorry, I meant the order for the WHERE clause..
srini.venigalla
@srini: order of predicates in the `WHERE` clause never matters. `campaign = @p1 AND client = @p2` and `client = @p2 AND campaign = @p1` both yield the same plan.
Quassnoi
Thanks again, so..if the predicate order does not matter, could you please elaborate your statement, "depends on the WHERE clause", how the WHERE clause affects the plan. Actually a URL would do..
srini.venigalla
@srini: the plan depends on which columns are used in a join, not in which order.
Quassnoi
@Quassnoi: Thanks, I was reading your other answers - quite an education..But isn't it logical assume, that I would use the most narrowing predicate first (which would be the index of the highest cardinality) and then the next higher cardinality and so on..
srini.venigalla
@srini: optimizer will take care of that automatically.
Quassnoi