views:

151

answers:

2

I have a website where visitors can leave comments. I want to add the ability to answer comments (i.e. nested comments).

At first this query was fast but after I populated the table with the existing comments (about 30000) a simple query like:

SELECT c.id, c2.id
  FROM (SELECT id
         FROM swb_comments
         WHERE pageId = 1411
         ORDER BY id DESC
         LIMIT 10) AS c
  LEFT JOIN swb_comments AS c2 ON c.id = c2.parentId

took over 2 seconds, with no childComments(!).

How do I optimize a query like this? On possible solution would be http://www.ferdychristant.com/blog//articles/DOMM-7QJPM7 (scroll to "The Flat Table Model done right") but this makes pagination rather difficult (how do I limit to 10 parent comments within 1 query?)

The table has 3 indexes, id, pageId and ParentId.

Thanks in advance!

EDIT:

Table definition added. This is the full definition with some differences to the above SELECT query, (i.e. pageId instead of numberId to avoid confussion)

CREATE TABLE `swb_comments` (
    `id` mediumint(9) NOT NULL auto_increment,
    `userId` mediumint(9) unsigned NOT NULL default '0',
    `numberId` mediumint(9) unsigned default NULL,
    `orgId` mediumint(9) unsigned default NULL,
    `author` varchar(100) default NULL,
    `email` varchar(255) NOT NULL,
    `message` text NOT NULL,
    `IP` varchar(40) NOT NULL,
    `timestamp` varchar(25) NOT NULL,
    `editedTimestamp` varchar(25) default NULL COMMENT 'last edited timestamp',
    `status` varchar(20) NOT NULL default 'publish',
    `parentId` mediumint(9) unsigned NOT NULL default '0',
    `locale` varchar(10) NOT NULL,
    PRIMARY KEY  (`id`),
    KEY `userId` (`userId`),
    KEY `numberId` (`numberId`),
    KEY `orgId` (`orgId`),
    KEY `parentId` (`parentId`)
  ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=34748 ;
+1  A: 

The issue is that MySQL cannot apply index if it need to deal with a result from a derived query (that's why you have NULL in the possible_keys column). So I suggest to filter out ten comments that you need:

SELECT * FROM swb_comments WHERE pageId = 1411 ORDER BY id DESC LIMIT 10

And after that send separate request to get answers for each comment id:

SELECT * FROM swb_comments WHERE parentId IN ($commentId1, $commentId2, ..., $commentId10)

In this case database engine will be able to apply pageId and parentId indexes efficiently.

Vitalii Fedorenko
Thank you! I will try this out!
Carl-Fredrik Herö
Normally, I'd downvote this, but having run into countless weird things with the MySQL optimizer that most other vendors don't have, I just have to nod my head in grave agreement.
Joe
A: 

If Mr Fedorenko is correct and the subquery is causing the optimiser difficulties, could you not try...

SELECT c.id, c2.id
    FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
    WHERE c.pageId = 1411
    ORDER BY c.id DESC
    LIMIT 10;

and see if it's any improvement?

Later - I have created a table using your definition, filled it in with 30,000 skeletal rows, and tried both the queries. They both complete in too short a time to notice. The explain plans are here...

mysql> EXPLAIN SELECT c.id, c2.id
               FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
               WHERE c.numberId = 1411     ORDER BY c.id DESC     LIMIT 10;
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key      | key_len | ref        | rows | Extra                       |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
|  1 | SIMPLE      | c     | ref  | numberId      | numberId | 4       | const      |    1 | Using where; Using filesort |
|  1 | SIMPLE      | c2    | ref  | parentId      | parentId | 3       | books.c.id |   14 |                             |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+

mysql> EXPLAIN SELECT c.id, c2.id
                   FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
                   WHERE c.numberId = 1411     ORDER BY c.id DESC     LIMIT 10;
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key      | key_len | ref        | rows | Extra                       |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
|  1 | SIMPLE      | c     | ref  | numberId      | numberId | 4       | const      |    1 | Using where; Using filesort |
|  1 | SIMPLE      | c2    | ref  | parentId      | parentId | 3       | books.c.id |   14 |                             |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+

and are exactly what I'd expect.

This is very mysterious.

I'll think about it a bit more to see if there's anything else we can try.

Brian Hooper
This was my original query, but it does not support pagination of parent comments. And this query is in fact much slower, about 32 sec.
Carl-Fredrik Herö
Hmm. That surprises me. What does the EXPLAIN say about it?
Brian Hooper
I've updated http://carl-fredrik.net/explain.html with an EXPLAIN for the above query. I think the "Using where; Using temporary; Using filesort" i the culprit here.
Carl-Fredrik Herö
Yes, it's really not very fond of your indexes, is it? Perhaps we could have a look at your table definition.
Brian Hooper
Have you tried `ANALYZE TABLE`, by the way?
Brian Hooper
The analyze table only says: Table is already up to date. Will give you the table definition tomorrow.
Carl-Fredrik Herö
@Brian The question has been updated to include the table definition.
Carl-Fredrik Herö
Righto. I'll give it a try.
Brian Hooper
Later - have tried it, but am no nearer to solving the problem. A bit of a head-scratcher, this one.
Brian Hooper
I have thought about this a bit more and the only thing I can suggest is that you use mysqldump to create a script, and use the script to create a new table (of slightly different name) and see if the query does better on that.
Brian Hooper
Many many thanks! I will try that and report back. Again, thank you very much!
Carl-Fredrik Herö