views:

36

answers:

2

Hi! I'm looking for a way to optimize one SQL query that I have. I'm trying to get how many poems with a certain genre.

Query looks like this:

SELECT
    COUNT(*)  
FROM
    `poems`
WHERE `id` IN (    
                  SELECT `poem_id`
                  FROM `poems_genres`  
                  WHERE `genre_title` = 'derision'
              )
       AND `status` = 'finished';

It takes too long (about 6-10 seconds), because it can't use indexes (because of IN() I think?). Is there a way to rewrite this query in different way to get the same result faster?

+9  A: 

MySQL has a problem with in where it repeatedly re-evaluates uncorrelated sub queries as though they were correlated. Does rewriting as a join improve things?

SELECT
    COUNT(distinct p.`id`)  
FROM `poems` p
JOIN `poems_genres` pg
ON  p.`id` = pg.`poem_id`  
WHERE pg.`genre_title` = 'derision' AND p.`status` = 'finished';

If not then according to this article (see the section "How to force the inner query to execute first") wrapping it up in a derived table might help.

SELECT
    COUNT(*)  
FROM
    `poems`
WHERE `id` IN
(
 select  `poem_id` from ( SELECT `poem_id`
                  FROM `poems_genres`  
                  WHERE `genre_title` = 'derision') x

) AND `status` = 'finished';
Martin Smith
Thanks, a lot of new info for me :)
Silver Light
Rewriting query using JOIN reduced rows examined by a query from 4266806 to 1644. Works as a lightning :)
Silver Light
+1  A: 

You could also use a EXISTS clause and correlate on the id and poem_id fields

SELECT
    COUNT(*)  
FROM
    `poems` p1
WHERE EXISTS
(
 SELECT `poem_id`
                  FROM `poems_genres`  pg
                  WHERE `genre_title` = 'derision'  and p1.id = pg.poem_id

) AND `status` = 'finished';

Difference between this and IN is it will try to use Indexes instead doing full table scan.

Greg