views:

43

answers:

6

I have a MySQL query that goes as follows

SELECT
    count(`clicks`.`user_id`) as total,
    `users`.`fullname`
FROM
    `users`,
    `clicks`,
WHERE
    `users`.`id` = `clicks`.`user_id`
GROUP BY
    `clicks`.`user_id`
ORDER BY
    `total` desc
LIMIT
    0,20;

I am running statistics on several button pressing type games. It has a user table and a clicks table. It logs clicks from a specific user. The user can click the button whenever they want. Twenty clicks one day, thirty another, etc. They aren't all in a row.

Currently there are about ~180k clicks for ~2k users. This query takes 1.38 seconds to run on average. I would like to speed it up if that's at all possible.

A: 

Be sure that you've created indexes on users.id and clicks.user_id. You can also try counting the clicks before performing the join, but I suspect that if this would actually improve performance that the engine would do it for you anyway.

VeeArr
A: 

Create an index on clicks.userid for starters. This will make a difference

Peter Tillemans
It will make a difference, but mostly impossible. It's a live table and clicks are added continuously. Indexing a table with that high of traffic will cause a large amount of server load, not to mention slowing the rate of inserts.
Josh K
You only need to do it once and the insert overhead for a simple index like this is not high.
Peter Tillemans
+1  A: 

That query is probably as fast as it gets, provided that you have the columns clicks.user_id and users.id indexed.

One thing about it that I can imagine being responsible for a lot of sluggishness, which is the ORDER BY clause. Seeing that it's an aggregate field it probably has to first get all the data and then sort it afterward without much optimization. Sorting is a good candidate when anything is slow.

Another idea though, is to maintain a separate table that contains the total clicks. If you need those records, then you may end up having to run 2 queries per click... one for the existing table, and another one for updating the user/click table, which would only have user_id and click_count plus whatever else you think is appropriate. That way, SELECTs should become lightning fast, even with lots and lots of users because you're only retrieving the absolute necessary minimum amount of rows as opposed to a whole bunch of them that then only get aggregated anyway.

Helgi Hrafn Gunnarsson
There's nothing you can do about an ORDER BY - it is the only means to reliably return ordered data.
OMG Ponies
@OMG Ponies: For sure, but like anything else, it can be used unwisely in which case it's essential to know whether it's the culprit or not. If it is, there are other table designs that might make sense for his circumstance, but that's a moot point unless he knows what's causing the slowness. It might not even be the ORDER BY clause at all.
Helgi Hrafn Gunnarsson
@Helgi: You can't guarantee order if you don't specify an ORDER BY clause. But an ORDER BY make use of indexes, if they exist...
OMG Ponies
@OMG Ponies: I understand that and I'm not protesting it. All I can do is repeat my last comment.
Helgi Hrafn Gunnarsson
A: 

Try this (untested):

SELECT
    C.total,
    `users`.`fullname`
FROM
    `users`
INNER JOIN
    (SELECT COUNT(*) AS total, user_id 
     FROM 
         `clicks` 
     GROUP BY 
        `user_id` 
     ORDER BY 
        COUNT(*) 
     LIMIT 0,20) C
ON C.user_id = users.user_id

ORDER BY
    C.total desc

Counting the rows first might save you a little time.

Matt
Yeah, that's a good idea too.
Helgi Hrafn Gunnarsson
OMG Ponies
A: 

I just thought of something else. You might get better results by using INNER JOINs.

Untested:

SELECT
    count(`clicks`.`user_id`) as total,
    `users`.`fullname`
FROM
    `users`
    INNER JOIN `clicks` ON `clicks`.`user_id` = `users`.`id`
GROUP BY
    `clicks`.`user_id`
ORDER BY
    `total` desc
LIMIT
    0,20;
Helgi Hrafn Gunnarsson
There's no performance gain to use ANSI-92 syntax vs ANSI-89 (what the OP listed).
OMG Ponies
+1  A: 
  1. Is USERS.id defined as the primary key for the table? It should be...
  2. Assuming INNODB, does CLICKS.user_id have a foreign key constraint on it to associate its values with USERS.id?
  3. USERS.id and CLICKS.user_id are a numeric data type (IE: INT), not text based?
  4. Indexes should be added (if they don't already exist) for:
    • CLICKS.user_id
    • USERS.fullname
  5. If indexes exist, have you tried refreshes the table statistics:

    ANALYZE TABLE USERS;
    ANALYZE TABLE CLICKS;
    
OMG Ponies
Why would you add an index on fullname? It's not used in the join or order clauses.
Matt
@Matt: Because being in the SELECT clause can trigger use if an index exists. However, MySQL only uses one index per SELECT statement (see EXPLAIN output)...
OMG Ponies
`users.id` is defined as the primary key for `users`. There is a separate `id` column for `clicks`. This is MyISAM, not INNODB. They are `ints`. I can't really add indexes to `user_id` on `clicks` but there is already an index on `fullname`.
Josh K