I recently ran into this problem: everything worked properly until I moved the application to my server at DreamHost.

The query below seems to take too long to execute, and I can't figure out why so many rows are processed at once. On my local server, the same query executes in 0.3 seconds.

SELECT feed_entries.*
FROM feed_entries
WHERE id IN (
    SELECT e.id
    FROM feed_entries AS e
    INNER JOIN feeds AS f ON e.feed_id = f.id
    INNER JOIN entries_categorias AS ec ON ec.entry_id = e.id
    INNER JOIN categorias AS c ON ec.categoria_id = c.id
    WHERE e.deleted = 0
      AND c.slug IN ('entenda', 'google')
    GROUP BY e.id
    HAVING COUNT(DISTINCT ec.id) = 2
)
ORDER BY date DESC
LIMIT 1
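
To check where the rows are being examined, the statement can be prefixed with EXPLAIN; on MySQL 5.0/5.1, an IN (subquery) typically shows up as a DEPENDENT SUBQUERY, meaning it is re-evaluated for every row of the outer table, which would account for huge Rows_examined counts:

-- EXPLAIN shows the plan instead of running the query;
-- look for "DEPENDENT SUBQUERY" in the select_type column
EXPLAIN SELECT feed_entries.*
FROM feed_entries
WHERE id IN (
    SELECT e.id
    FROM feed_entries AS e
    INNER JOIN feeds AS f ON e.feed_id = f.id
    INNER JOIN entries_categorias AS ec ON ec.entry_id = e.id
    INNER JOIN categorias AS c ON ec.categoria_id = c.id
    WHERE e.deleted = 0
      AND c.slug IN ('entenda', 'google')
    GROUP BY e.id
    HAVING COUNT(DISTINCT ec.id) = 2
)
ORDER BY date DESC
LIMIT 1;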

This is the message I received from the support team:

Well, it looks like the query is locked up by other queries like this one, which processes 6.5 million records and takes 11 seconds.

# Query_time: 11.639269  Lock_time: 0.000192  Rows_sent: 2  Rows_examined: 6509098
use desenvolvimentistas;
SET timestamp=1280325753;
SELECT `e`.*, `f`.`titulo` AS `feedTitulo`, `f`.`url` AS `feedUrl`
FROM `feed_entries` AS `e`
INNER JOIN `feeds` AS `f` ON e.feed_id = f.id
WHERE (e.id IN (
    SELECT `e`.`id`
    FROM `feed_entries` AS `e`
    INNER JOIN `feeds` AS `f` ON e.feed_id = f.id
    INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id = e.id
    INNER JOIN `categorias` AS `c` ON ec.categoria_id = c.id
    WHERE (e.deleted = 0)
      AND (e.id NOT IN ('', '', '842', '853', '774', '878'))
      AND (e.imagem145x145 = 1)
      AND (c.slug IN ('destaque-2'))
    GROUP BY `e`.`id`
    HAVING (COUNT(DISTINCT ec.id) = 1)
))
ORDER BY `e`.`date` DESC
LIMIT 4;

The query likely takes 0.3 seconds on your local machine because that machine is idle; it is slow on our servers because roughly 150 of these queries are running at a time, and each one is processing 15.3 million records.

I had our MySQL admin take a look at it, and he said it was quite inefficient; he was surprised the database was even accessible, the queries had it so locked up. He said the database will be disabled if it continues this way and causes server problems. He said you should fix the rows examined or get a MySQL PS, although a MySQL PS will not fix it - it will only prevent the database from being disabled.

Here is the query:

# Query_time: 25.944779  Lock_time: 0.000176  Rows_sent: 0  Rows_examined: 15378209
use desenvolvimentistas;
SELECT `feed_entries`.*
FROM `feed_entries`
WHERE (id IN (
    SELECT `e`.`id`
    FROM `feed_entries` AS `e`
    INNER JOIN `feeds` AS `f` ON e.feed_id = f.id
    INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id = e.id
    INNER JOIN `categorias` AS `c` ON ec.categoria_id = c.id
    WHERE (e.deleted = 0)
      AND (c.slug IN ('entenda', 'google'))
    GROUP BY `e`.`id`
    HAVING (COUNT(DISTINCT ec.id) = 2)
))
ORDER BY `date` DESC
LIMIT 1;

Here is another that needs to be fixed:

# Query_time: 27.010857  Lock_time: 0.000165  Rows_sent: 0  Rows_examined: 15382750
use desenvolvimentistas;
SET timestamp=1280325706;
SELECT `e`.*, `f`.`titulo` AS `feedTitulo`, `f`.`url` AS `feedUrl`
FROM `feed_entries` AS `e`
INNER JOIN `feeds` AS `f` ON e.feed_id = f.id
WHERE (e.id IN (
    SELECT `e`.`id`
    FROM `feed_entries` AS `e`
    INNER JOIN `feeds` AS `f` ON e.feed_id = f.id
    INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id = e.id
    INNER JOIN `categorias` AS `c` ON ec.categoria_id = c.id
    WHERE (e.deleted = 0)
      AND (c.slug IN ('manchete', 'google'))
    GROUP BY `e`.`id`
    HAVING (COUNT(DISTINCT ec.id) = 2)
))
ORDER BY `e`.`date` DESC
LIMIT 4;
+1  A: 

Well, the first thing I would check is that you have the correct indexes on the hosted server. Sometimes people forget to recreate these when migrating databases.
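
A minimal way to compare the two servers, using standard MySQL statements (the index name below is only an illustration):

-- list the indexes on each table involved in the query
SHOW INDEX FROM feed_entries;
SHOW INDEX FROM entries_categorias;
SHOW INDEX FROM categorias;

-- if something is missing on the hosted copy, recreate it, for example:
CREATE INDEX idx_entry_id ON entries_categorias (entry_id);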

HLGEM
A: 

You can always cache it ;-)
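
If that means MySQL's built-in query cache (available in MySQL 5.x, removed in 8.0), a minimal sketch, assuming the server has it enabled:

-- check whether the server's query cache is on
SHOW VARIABLES LIKE 'query_cache%';

-- identical SELECTs are then served from the cache; SQL_CACHE makes it explicit
SELECT SQL_CACHE feed_entries.*
FROM feed_entries
WHERE deleted = 0
ORDER BY `date` DESC
LIMIT 1;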

liu
I already have caching, but this still needs to be fixed: my table has no more than a thousand rows, yet they said 15 million rows were processed by a single query.
Keyne
A: 

I'd try running the inner query on its own, then adding the resulting ids manually into the outer one (WHERE id IN (1, 2, 3, 4, 5)). While this doesn't automatically improve performance, you at least know what's going on.

The way they describe the problem, it may even be that the inner query is executed again for every single record checked by the outer query, which this approach would fix - although I'd have assumed MySQL is smart enough to handle that by itself.
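
A minimal sketch of that two-step approach (the ids in step 2 are hypothetical placeholders for whatever step 1 returns):

-- step 1: run the inner query on its own and note the ids it returns
SELECT e.id
FROM feed_entries AS e
INNER JOIN feeds AS f ON e.feed_id = f.id
INNER JOIN entries_categorias AS ec ON ec.entry_id = e.id
INNER JOIN categorias AS c ON ec.categoria_id = c.id
WHERE e.deleted = 0
  AND c.slug IN ('entenda', 'google')
GROUP BY e.id
HAVING COUNT(DISTINCT ec.id) = 2;

-- step 2: paste those ids into the outer query as literals
SELECT feed_entries.*
FROM feed_entries
WHERE id IN (1, 2, 3, 4, 5)  -- hypothetical ids from step 1
ORDER BY `date` DESC
LIMIT 1;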

Nicolas78
+1  A: 

Ideally, one query per question please - there's no limit to the number of questions you can ask, and it minimizes chatter associated with each individual query.

I rewrote your first query as:

  SELECT t.*
    FROM feed_entries t
   WHERE EXISTS(SELECT NULL
                  FROM feed_entries AS e
                  JOIN feeds AS f ON e.feed_id = f.id
                  JOIN entries_categorias AS ec ON ec.entry_id = e.id
                  JOIN categorias AS c ON ec.categoria_id = c.id
                 WHERE e.deleted = 0
                   AND c.slug IN ('entenda', 'google')
                   AND e.id = t.id
              GROUP BY e.id
                HAVING COUNT( DISTINCT ec.id ) = 2)
ORDER BY date DESC
   LIMIT 1

Index the columns used in the JOIN criteria at a minimum, if you haven't already - something like the sketch below.
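
A sketch of the kind of indexes meant, assuming none of them exist yet (index names are illustrative; adjust to the real schema):

-- JOIN criteria first
CREATE INDEX idx_feed_entries_feed_id ON feed_entries (feed_id);
CREATE INDEX idx_ec_entry_categoria ON entries_categorias (entry_id, categoria_id);
CREATE INDEX idx_categorias_slug ON categorias (slug);

-- then the ORDER BY column, since only the newest row is kept
CREATE INDEX idx_feed_entries_date ON feed_entries (`date`);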

I saw a comment about a query being run 150 times - could you elaborate?

OMG Ponies
Would it be a good suggestion to have an index on 'date' too, given that he orders by it but only takes the top result?
Neil Trodden
@Neil Trodden: Yes, but [MySQL limits index key sizes (1,000 bytes for MyISAM, 767 bytes for InnoDB)](http://dev.mysql.com/doc/refman/5.0/en/create-index.html), so the JOIN criteria are my first priority; then I'd see what I can do with covering indexes.
OMG Ponies
Thanks, I've reduced the number of concurrent queries (the ~150) and defined the indexes. I also discovered other mistakes along the way. Now it's OK.
Keyne