views:

79

answers:

4

Here is the log output:

# Time: 100915 13:06:49
# User@Host: ss[ss] @ localhost []
# Query_time: 13.978355  Lock_time: 0.000029 Rows_sent: 10  Rows_examined: 562760
use ss;
SET timestamp=1284574009;

  SELECT DISTINCT 
         SQL_CALC_FOUND_ROWS 
         B.*, 
         U.username 
    FROM sc_users AS U, 
         sc_bookmarks AS B     
   WHERE B.uId = U.uId 
     AND B.bStatus = 0 
GROUP BY B.bHash 
ORDER BY B.bModified DESC
   LIMIT 10;

and the explain output:

id  select_type table   type    possible_keys   key key_len ref rows    Extra 
----------------------------------------------------------------------------------------------------
1   SIMPLE  U   ALL PRIMARY NULL    NULL    NULL    2   Using temporary; Using filesort
1   SIMPLE  B   ALL sc_bookmarks_usd    NULL    NULL    NULL    187586  Using where; Using join buffer

UPDATE: Here is the create statement for sc_users thanks OMGPonies / BrianHooper :-)

CREATE TABLE `sc_users` (
  `uId` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(25) NOT NULL DEFAULT '',
  `password` varchar(40) NOT NULL DEFAULT '',
  `uDatetime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `uModified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `name` varchar(50) DEFAULT NULL,
  `email` varchar(50) NOT NULL DEFAULT '',
  `homepage` varchar(255) DEFAULT NULL,
  `uContent` text,
  PRIMARY KEY (`uId`)
)     ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3 ;

More details (thanks Mark Byers):

This query is generated by the open source project semanticscuttle to generate the home-page. I do not know why the query was written as it was, as I am just getting my feet wet in this project. You can see the source where this sql is generated here though.

I have tried adding various indexes and combined indexes with no luck speeding up the query or removing the file scan. Any tips are appreciated!

UPDATE: The slow-down is caused by the ORDER BY statement... The query bellow runs in 4 seconds with the ORDER BY, 0.01 seconds without it. Any ideas why / how to fix it?

SELECT DISTINCT SQL_CALC_FOUND_ROWS B.*, U.username 
FROM sc_users AS U, sc_bookmarks AS B 
WHERE B.uId = U.uId 
AND B.bStatus = 0 
GROUP BY B.bHash 
ORDER BY B.bModified DESC

UPDATE: NEW SQL and index Quassnoi in an answer bellow... This looked great to me, but actually increased execution time, and didn't get rid of the file sorts:

EXPLAIN SELECT SQL_CACHE b.*, u.username  FROM    (         SELECT  DISTINCT bHash         FROM    sc_bookmarks b         WHERE   bStatus = 0         ) bd JOIN    sc_bookmarks b ON      b.bId =         (         SELECT  bId         FROM    sc_bookmarks bi         WHERE   bi.bStatus = 0                 AND bi.bHash = bd.bHash         ORDER BY                 bStatus DESC, bHash DESC, bModified DESC, bId DESC         LIMIT 1         ) JOIN    sc_users u ON      u.uId = b.uId ORDER BY         bModified ASC LIMIT 10;
+----+--------------------+------------+--------+--------------------------+------------------+---------+----------+--------+---------------------------------+
| id | select_type        | table      | type   | possible_keys            | key              | key_len | ref      | rows   | Extra                           |
+----+--------------------+------------+--------+--------------------------+------------------+---------+----------+--------+---------------------------------+
|  1 | PRIMARY            | <derived2> | ALL    | NULL                     | NULL             | NULL    | NULL     | 187565 | Using temporary; Using filesort |
|  1 | PRIMARY            | b          | eq_ref | PRIMARY,sc_bookmarks_usd | PRIMARY          | 4       | func     |      1 | Using where                     |
|  1 | PRIMARY            | u          | eq_ref | PRIMARY,su_idx           | PRIMARY          | 4       | ss.b.uId |      1 |                                 |
|  3 | DEPENDENT SUBQUERY | bi         | ref    | sc_bookmarks_hui,bStatus | sc_bookmarks_hui | 98      | bd.bHash |      1 | Using where; Using filesort     |
|  2 | DERIVED            | b          | ref    | bStatus                  | bStatus          | 1       |          |  94556 | Using where; Using index        |
+----+--------------------+------------+--------+--------------------------+------------------+---------+----------+--------+---------------------------------+

UPDATE: per the request of Quassnoi, here is the CREATE TABLE statement for sc_bookmarks.

CREATE TABLE `sc_bookmarks` (
 `bId` int(11) NOT NULL AUTO_INCREMENT,
 `uId` int(11) NOT NULL DEFAULT '0',
 `bIp` varchar(40) DEFAULT NULL,
 `bStatus` tinyint(1) NOT NULL DEFAULT '0',
 `bDatetime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
 `bModified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
 `bTitle` varchar(255) NOT NULL DEFAULT '',
 `bAddress` varchar(1500) NOT NULL,
 `bDescription` text,
 `bPrivateNote` text,
 `bHash` varchar(32) NOT NULL DEFAULT '',
 `bVotes` int(11) NOT NULL,
 `bVoting` int(11) NOT NULL,
 `bShort` varchar(16) DEFAULT NULL,
 PRIMARY KEY (`bId`),
 KEY `sc_bookmarks_usd` (`uId`,`bStatus`,`bDatetime`),
 KEY `sc_bookmarks_hui` (`bHash`,`uId`,`bId`),
 KEY `sc_bookmarks_du` (`bDatetime`,`uId`),
 KEY `sc_bookmarks_modified_idx` (`bModified`),
 KEY `bStatus` (`bStatus`,`bHash`,`bModified`,`uId`,`bId`),
 KEY `bHash` (`bHash`),
 CONSTRAINT `sc_bookmarks_ibfk_1` FOREIGN KEY (`uId`) REFERENCES `sc_users` (`uId`),
 CONSTRAINT `sc_bookmarks_ibfk_2` FOREIGN KEY (`uId`) REFERENCES `sc_users` (`uId`)
) ENGINE=InnoDB AUTO_INCREMENT=187566 DEFAULT CHARSET=utf8

UPDATE with jacobs feedback on cwiske's answer. The query is well optimized, but it does not return the same results as the original query.

For example I have 3 entries in table "sc_bookmarks".

bId           bModified      bHash
------------------------------------
2000-11-10    aaaaaa         ...
2011-12-12    bbbbbb         ...
2010-11-11    aaaaaa         ...

The query should return only row id 1 and 2. But it return all 3 records. Seems mysql doesn't like: ´GROUP BY B.bModified DESC, B.bHash DESC´

I tried to change ´GROUP BY B.bModified DESC, B.bHash DESC´

to ´GROUP BY B.bModified, B.bHash GROUP BY B.bHash, B.bModified´

but I'm still getting all 3 records.

It works when I leave ´GROUP BY B.bHash´ but that prevents the index from working.

+1  A: 

It might not hurt to run ANALYZE TABLE first:

ANALYZE TABLE sc_users;
ANALYZE TABLE sc_bookmarks;

Let's see if there's improvement if you add the following index:

CREATE INDEX su_idx USING BTREE ON SC_USERS(uid, username) 
OMG Ponies
Thanks for the advice. I added the index, and ran analyze tables, Unfortunately it did not change the run time, or the result of the explain statement.
Eric
+1  A: 

I believe you want to retrieve last modified bookmark for each hash and the user associated with it.

Assuming that the PRIMARY KEY column on sc_bookmarks is called id:

SELECT  b.*, u.username 
FROM    (
        SELECT  DISTINCT bHash
        FROM    sc_bookmarks b
        WHERE   bStatus = 0
        ) bd
JOIN    sc_bookmarks b
ON      b.id =
        (
        SELECT  id
        FROM    sc_bookmarks bi
        WHERE   bi.bStatus = 0
                AND bi.bHash = bd.bHash
        ORDER BY
                bStatus DESC, bHash DESC, bModified DESC, id DESC
        LIMIT 1
        )
JOIN    sc_users u
ON      u.uId = b.uId
ORDER BY
        bModified DESC
LIMIT 10

Create an index on sc_bookmarks (bStatus, bHash bModified, id) for this to work fast.

Quassnoi
Thanks for the ideas. I posted the result of the new explain above. Unfortunately it's still executing 2 file sorts, and is taking a bit longer (I did also create the suggested index)
Eric
+1  A: 

To get optimal performance, the query needs to be as follows:

SELECT U.username, B.*
 FROM sc_users AS U, sc_bookmarks AS B
 WHERE B.uId = U.uId AND B.bStatus = 0
 GROUP BY B.bModified DESC, B.bHash DESC
 ORDER BY B.bModified DESC
LIMIT 10

and you should create a combined index on status+modified+hash.

cweiske
Thanks, I'm going to try this today.
Eric
I will also implement this in SemanticScuttle itself, but that is a bit harder than it seems and will take some time.
cweiske
cweiske, please see the update from jakub, which I also added to my original question. Thanks, -Eric
Eric
+1  A: 

Hi

I try Your query cweiske.

This works great, no "Using temporary" or "Using filesort"

But it return wrong records.

For example I have 3 entries in table "sc_bookmarks".

bId, bModified, bHash, ...

1, 2000-11-10, aaaaaa, ...

2, 2011-12-12, bbbbbb, ...

3, 2010-11-11, aaaaaa, ...

So I expect that Your query return only row id 1 and 2. But it return all 3 records. Looks like mysql don't like "GROUP BY B.bModified DESC, B.bHash DESC"

I try to change "GROUP BY B.bModified DESC, B.bHash DESC" to

  • GROUP BY B.bModified, B.bHash

  • GROUP BY B.bHash, B.bModified

but it newer works ok, still getting 3 records.

It works only when I leave "GROUP BY B.bHash" but in this case index is not working.

Jakub