views:

163

answers:

1

So I started off with this query:

SELECT * FROM TABLE1 WHERE hash IN (SELECT id FROM temptable);

It took forever, so I ran an explain:

mysql> explain SELECT * FROM TABLE1 WHERE hash IN (SELECT id FROM temptable);
+----+--------------------+-----------------+------+---------------+------+---------+------+------------+-------------+
| id | select_type        | table           | type | possible_keys | key  | key_len | ref  | rows       | Extra       |
+----+--------------------+-----------------+------+---------------+------+---------+------+------------+-------------+
|  1 | PRIMARY            | TABLE1          | ALL  | NULL          | NULL | NULL    | NULL | 2554388553 | Using where | 
|  2 | DEPENDENT SUBQUERY | temptable       | ALL  | NULL          | NULL | NULL    | NULL |       1506 | Using where | 
+----+--------------------+-----------------+------+---------------+------+---------+------+------------+-------------+
2 rows in set (0.01 sec)

It wasn't using an index. So, my second pass:

mysql> explain SELECT * FROM TABLE1 JOIN temptable ON TABLE1.hash=temptable.hash;
+----+-------------+-----------------+------+---------------+----------+---------+------------------------+------+-------------+
| id | select_type | table           | type | possible_keys | key      | key_len | ref                    | rows | Extra       |
+----+-------------+-----------------+------+---------------+----------+---------+------------------------+------+-------------+
|  1 | SIMPLE      | temptable       | ALL  | hash          | NULL     | NULL    | NULL                   | 1506 |             | 
|  1 | SIMPLE      | TABLE1          | ref  | hash          | hash     | 5       | testdb.temptable.hash  |  527 | Using where | 
+----+-------------+-----------------+------+---------------+----------+---------+------------------------+------+-------------+
2 rows in set (0.00 sec)

Can I do any other optimization?

+1  A: 

You can gain some more speed by using a covering index, at the cost of extra space consumption. A covering index is one which can satisfy all requested columns in a query without performing a further lookup into the clustered index.

First of all get rid of the SELECT * and explicitly select the fields that you require. Then you can add all the fields in the SELECT clause to the right hand side of your composite index. For example, if your query will look like this:

SELECT  first_name, last_name, age 
FROM    table1 
JOIN    temptable ON table1.hash = temptable.hash;

Then you can have a covering index that looks like this:

CREATE INDEX ix_index ON table1 (hash, first_name, last_name, age);
Daniel Vassallo
@Daniel: I will remove the *. One quick question: If space is not really an issue, is it logical to add an index on all columns?
Legend
@Legend: Yes, in many situations it is recommended. In general the main disadvantage of a covering index over a regular one is that it takes up much more space, and that inserts and updates become a little bit slower (or take more resources). Unless you'll be inserting many new records per second, this is rarely an issue, and is often negligible.
Daniel Vassallo
@Daniel: Thank you. I will try that out and see if it works. My table has 2 billion records but I won't be updating them anymore.
Legend