tags:

views:

27

answers:

2

Hello, I have two (characteristic_list and measure_list) tables that are related to each other by a column called 'm_id'. I want to retrieve records using filters (columns from characteristic_list) within a date range (columns from measure_list). When I gave the following SQL using INNER JOIN, it takes a while to retrieve the record. What am I doing wrong?

mysql> explain select c.power_set_point, m.value, m.uut_id, m.m_id, m.measurement_status, m.step_name from measure_list as m INNER JOIN characteristic_lis
t as c ON (m.m_id=c.m_id) WHERE (m.sequence_end_time BETWEEN '2010-06-18'  AND '2010-06-20');
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL | 82952 |             | 
|  1 | SIMPLE      | m     | ALL  | NULL          | NULL | NULL    | NULL | 85321 | Using where | 
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
2 rows in set (0.00 sec)

mysql> select count(*) from measure_list;
+----------+
| count(*) |
+----------+
|    83635 | 
+----------+
1 row in set (0.18 sec)

mysql> select count(*) from characteristic_list;
+----------+
| count(*) |
+----------+
|    83635 | 
+----------+
1 row in set (0.10 sec)
A: 

The reason this query takes a while to execute is because it has to scan the entire table. You never want to see "ALL" as the type of the query. To speed things up, you need to make smart decisions about what to index.

See the following documents at the MySQL site:

http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html

http://dev.mysql.com/doc/refman/5.1/en/using-explain.html

Dan D.
Thanks for the answer. To follow up, how many index would you recommend with a table of more than 1 million rows. Is it okay to add all columns that are or will be part of the WHERE condition?
cfpete
It really depends. There is a fine line between having just the right number of indexes, and too much. Certainly try creating indexes for the columns included in the WHERE condition (and the ID fields used in the JOIN condition). If you get better performance, and the index file doesn't grow too large, go for it. Query optimization is, in my mind, kind of a "dark art." For me, it's rather hit or miss, and takes an extensive understanding of how databases work.
Dan D.
@cfpete: your question above shows only 85K rows in each table. Is that just a sampling? If that is real, I think it represents only small tables, and you might not have to be so cautious with the indexes.
MJB
A: 

As an add-on to the previous answer by Dan, you should consider indexing the join columns and the where columns. In this case, that means the m_id cols in both tables and the sequence_end_time in the measure_list table. They are small enough that you could add an index, run explain plan and time it, then change the index and compare. Should be relatively quick to solve.

MJB