views:

199

answers:

4

I'm kinda ashamed of asking this since I've been working with MySQL for years, but oh well.

I have a table with two fields, a and b. I will be running the following queries on it:

  • SELECT * FROM ... WHERE A = 1;
  • SELECT * FROM ... WHERE B = 1;
  • SELECT * FROM ... WHERE A = 1 AND B = 1;

From the performance point of view, is at least one of the following configurations of indexes slower for at least one query? If yes, please elaborate.

  1. ALTER TABLE ... ADD INDEX (a); ALTER TABLE ... ADD INDEX (b);
  2. ALTER TABLE ... ADD INDEX (a, b);
  3. ALTER TABLE ... ADD INDEX (a); ALTER TABLE ... ADD INDEX (b); ALTER TABLE ... ADD INDEX (a, b);

Thanks (note that we are talking about non unique indexes)

+3  A: 

Yes, at least one case is considerably slower. If you only define the following index:

ALTER TABLE ... ADD INDEX (a, b);

... then the query SELECT * FROM ... WHERE B = 1; will not use that index.

When you create an index with a composite key, the order of the columns of the key is important. It is recommended to try to order the columns in the key to enhance selectivity, with the most selective columns to the left-most of the key. If you don't do this, and put a non-selective column as the first part of the key, you risk not using the index at all. (Source: Tips on Optimizing SQL Server Composite Index)

Daniel Vassallo
And is there any advantage of going with configuration #3 rather than #1 ?
Andreas Bonini
Yes, there could be an advantage. A composite index may become a covering index, when all the data in the query is held within the index itself. In general, `SELECT A, B FROM ... WHERE A = 1;`, will be faster if it uses the composite index on `(A, B)`, since the query will not need to fetch any data from the table. The data is already in the index.
Daniel Vassallo
Remember that indexes take space, and may compromise a little performance on inserts and updates. I'd go with option 3 only if your application will be doing queries that justify the extra index... Tips on a covering index: http://www.sql-server-performance.com/tips/covering_indexes_p1.aspx
Daniel Vassallo
@Daniel: Yes, but even if `(A, B)` is not a covering index for the query, it will be a lot faster when the two sets returned from the individual indexes are large, as a huge set union operation can sometimes be more expensive than a full table scan.
Henning
@Henning: Good point. When selecting `a = 1 and b = 1`, option 3 could be faster even if the index is not a covering index.
Daniel Vassallo
Many optimizers will be able to use an index on (a,b) to answer both the 'WHERE a = 1' and the 'WHERE a = 1 AND b = 2' type queries; indeed, in many systems, the single-column index on b would be superfluous unless the (a,b) index was non-unique and the (b) index was unique - which is explicitly not the case here.
Jonathan Leffler
+1  A: 

It's very improbable that mere existence of an index slow down a SELECT query: it just won't be used.

In theory the optimizer can incorrectly choose more long index on (a, b) rather than one on (a) to serve the query which searches only for a.

In practice, I've never seen it: MySQL usually does the opposite mistake, taking a shorter index when a longer one exists.

Update:

In your case, either of the following configurations will suffice for all queries:

(a, b); (b)

or

(b, a); (a)

MySQL can also use two separate indexes with index_intersect, so creating these indexes

(a); (b)

will also speed up the query with a = 1 AND b = 1, though to a lesser extent than any of the solutions above.

You may also want to read this article in my blog:

Update 2:

Seems I finally understood your question :)

ALTER TABLE ... ADD INDEX (a); ALTER TABLE ... ADD INDEX (b);

Excellent for a = 1 and b = 1, reasonably good for a = 1 AND b = 1

ALTER TABLE ... ADD INDEX (a, b);

Excellent for a = 1 AND b = 1, almost excellent for a = 1, poor for b = 1

ALTER TABLE ... ADD INDEX (a); ALTER TABLE ... ADD INDEX (b); ALTER TABLE ... ADD INDEX (a, b);

Excellent for all three queries.

Quassnoi
Well, yes, but why add unnecessary indexes?
Andreas Bonini
I think you are misinterpreting the OP. I thought he was asking "Is one of these slower than the others?" not "Is one of these slower than having no indexes at all?"
David Oneill
+1  A: 

SQL will choose the index that best covers the query. An index on A, B will cover the query for both case 1 and 3, but not for 2 (since the primary index column is A)

So to cover all three queries you need two indexes:

ALTER TABLE ... ADD INDEX (a, b); ALTER TABLE ... ADD INDEX (b)
jjacka
A: 

For the example you have index set #3 is optimal. Mysql will choose the single A and B indices for single column where clauses, and use the compound index for the A & B where clause.

netricate