views:

188

answers:

4

I have a 3 large tables (10k, 10k, and 100M rows) and am trying to do a simple count on a join of them, where all the joined columns are indexed. Why does the COUNT(*) take so long, and how can I speed it up (without triggers and a running summary)?

mysql> describe SELECT COUNT(*) FROM `metaward_alias` INNER JOIN `metaward_achiever` ON (`metaward_alias`.`id` = `metaward_achiever`.`alias_id`) INNER JOIN `metaward_award` ON (`metaward_achiever`.`award_id` = `metaward_award`.`id`) WHERE `metaward_award`.`owner_id` = 8;
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
| id | select_type | table             | type   | possible_keys                                         | key                        | key_len | ref                             | rows | Extra       |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
|  1 | SIMPLE      | metaward_award    | ref    | PRIMARY,metaward_award_owner_id                       | metaward_award_owner_id    | 4       | const                           | 1552 |             | 
|  1 | SIMPLE      | metaward_achiever | ref    | metaward_achiever_award_id,metaward_achiever_alias_id | metaward_achiever_award_id | 4       | paul.metaward_award.id          | 2498 |             | 
|  1 | SIMPLE      | metaward_alias    | eq_ref | PRIMARY                                               | PRIMARY                    | 4       | paul.metaward_achiever.alias_id |    1 | Using index | 
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
3 rows in set (0.00 sec)

But actually running the query takes about 10 minutes, and I'm on MyISAM so the tables are fully locked down for that duration

A: 

Not familiar with myISAM, so forgive the newbie question. Is there a way to turn on some sort of "showplan" setting there like in Sybase? If so, could you please do so and post the plan here? Thx

DVK
that's what the 'describe' command does
Javier
@DVK InnoDB, MyISAM, Falcon are just a storage engines for MySQL. In other words, if you want to make analogies, make them between Sybase and MySQL, not MyISAM :)
Pascal Thivent
+1  A: 

Make sure you have indexes on:

metaward_alias      id
metaward_achiever   alias_id
metaward_achiever   award_id
metaward_award      id
metaward_award      owner_id

I'm sure many people will also suggest to count on a specific column, but in MySql this doesn't make any difference for your query.

UPDATE:

You could also try to set the condition on the main table instead of one of the joined tables. That would give you the same result, but it could be faster (I don't know how clever MySql is):

SELECT COUNT(*) FROM `metaward_award` 
   INNER JOIN `metaward_achiever` 
      ON (`metaward_achiever`.`award_id` = `metaward_award`.`id`) 
   INNER JOIN `metaward_alias` 
      ON (`metaward_alias`.`id` = `metaward_achiever`.`alias_id`) 
WHERE `metaward_award`.`owner_id` = 8
Philippe Leybaert
worse, a COUNT(*) counts the number of records, while COUNT(field) counts the number of non-null values on that field. on the first case, it can use the index, on the second one, it might have to read each record.
Javier
Yup, I have all those indexes. So I'm stuck with a 10 minute query?
Paul Tarjan
@Javier I was wondering too if a COUNT(1) or COUNT(id) would be more efficient.
Pascal Thivent
Doing your query gives the exact same result with describe.
Paul Tarjan
+2  A: 

I guess the reason is that you do a huge join over three tables (without applying where clause first, the result would be 10k * 10k * 100M = 1016 rows). Try to reorder joins (for example start with metaward_award, then join only metaward_achiever see how long that takes, then try to plug metaward_alias, possibly using subquery to force your preferred evaluation order).

If that does not help you might have to denormalize your data, for example by storing number of aliases for particular metaward_achiever. Then you'd get rid of one join altogether. Maybe you can even cache the sums for metaward_award, depending on how and how often is your data updated.

Other thing that might help is getting all your database content into RAM :-)

che
+1  A: 

10 minutes is way too long for that query. I think you must have a really small key cache. You can get its size in bytes with:

SELECT @@key_buffer_size

First off, you should run ANALYZE TABLE or OPTIMIZE TABLE. They'll sort your index and can slightly improve the performance.

You should also see if you can use more compact types for your columns. For instance, if you're not going to have more than 16 millions owners or awards or aliases, you can change your INT columns into MEDIUMINT (UNSIGNED, of course). Perhaps even SMALLINT in some cases? That will reduce your index footprint and you'll fit more of it in the cache.

Josh Davis
key_buffer_size == 100663296. Is that a good size?
Paul Tarjan
96 MB, depends on how much RAM your server has. If you have free RAM, you should increase it. You should try asking at serverfault.com for appropriate values.
Josh Davis