views:

295

answers:

3

I have a table which counts occurrences of one specific action by different users on different objects:

CREATE TABLE `Actions` (
    `object_id` int(10) unsigned NOT NULL,
    `user_id` int(10) unsigned NOT NULL,
    `actionTime` datetime
);

Every time a user performs this action, a row is inserted. I can count how many actions were performed on each object, and order objects by 'activity':

SELECT object_id, count(object_id) AS action_count 
FROM `Actions` 
GROUP BY object_id 
ORDER BY action_count;

How can I limit the results to the top n objects? The LIMIT clause is applied before the aggregation, so it leads to wrong results. The table is potentially huge (millions of rows) and I probably need to count tens of times per minute, so I'd like to do this as efficient as possible.

edit: Actually, Machine is right, and I was wrong with the time at which LIMIT is applied. My query returned the correct results, but the GUI presenting them to me threw me off...this kind of makes this question pointless. Sorry!

A: 
SELECT * FROM (SELECT object_id, count(object_id) AS action_count 
        FROM `Actions` 
        GROUP BY object_id 
        ORDER BY action_count) LIMIT 10;
Gandalf
+1  A: 

How about:

SELECT * FROM
(
SELECT object_id, count(object_id) AS action_count 
FROM `Actions` 
GROUP BY object_id 
ORDER BY action_count
)
LIMIT 15

Also, if you have some measure of what must be the minimum number of actions to be included (e.g. the top n ones are surely more than 1000), you can increase the efficiency by adding a HAVING clause:

SELECT * FROM
(
SELECT object_id, count(object_id) AS action_count 
FROM `Actions` 
GROUP BY object_id 
HAVING action_count > 1000
ORDER BY action_count
)
LIMIT 15
Roee Adler
+1  A: 

Actually... LIMIT is applied last, after a eventual HAVING clause. So it should not give you incorrect results. However, since LIMIT is applied last, it will not provide any faster execution of your query, since a temporary table will have to be created and sorted in order of action count before chopping off the result. Also, remember to sort in descending order:

SELECT object_id, count(object_id) AS action_count 
FROM `Actions` 
GROUP BY object_id 
ORDER BY action_count DESC
LIMIT 10;

You could try adding an index to object_id for optimization. In that way only the index will need to be scanned instead of the Actions table.

PatrikAkerstrand