views:

165

answers:

4

Good day, I have a question I'm struggling with a lot, hope somebody already found a clever solution to this (I use MySQL).

I have table like this:

Table `log`
----------
id
inserted
message
user_id

My goal is to select last inserted record for user and make this fast. Log table is huge (around 900k records), so my first approach was:

SELECT * FROM `log` 
LEFT JOIN `users` ON `users`.`id` = `log`.`user_id`
WHERE `id` IN 
(
 SELECT MAX(`id`) FROM `log` GROUP BY `user_id`
)

But it seems it calculate subquery for every row (EXPLAIN shows DEPENDENT QUERY). When I split this query for two:

SELECT MAX(`id`) FROM `log` GROUP BY `user_id`

and

SELECT * FROM `log` 
LEFT JOIN `users` ON `users`.`id` = `log`.`user_id`
WHERE `id` IN (....ids from first query...)

It is acceptable to run. Can this be achived by one query?

+4  A: 

How about

SELECT user_id, max(id) FROM `log` GROUP BY user_id

?

This will get you the maximum id for every user in the log table, all in one query!

Dominik
See my answer to speed this up even more.
David Oneill
But what about the field `message`? That's probably supposed to be in the result set as well.
VolkerK
Yes, VolkerK is right, I need other fields (like `message`) as well.
artvolk
+1  A: 

If you always are looking for the log for a particular user, partitioning the log file by user_id would speed things up a lot. If the table is partitioned by user and indexed by id, the query will run very fast.

EDIT: see Dominik's query

David Oneill
A: 

In addition, I would make sure you have an index on user_id.

EDIT: generalized

Ryan
+1  A: 

In addition to using group by to fetch the group-wise maximum you probably want to make it an uncorrelated subquery to fetch additional fields for the specific rows from the table.

SELECT
  la.user_id,la.message
FROM
  `log` as la
INNER JOIN
  (
    SELECT
      user_id, MAX(id) AS maxid
    FROM
      `log`
    GROUP BY
      user_id
    ) as lb
ON
  la.id = lb.maxid

This works best/fastest if you have an index

KEY `foo` (`user_id`,`id`)

but even without that key the performance is descent.

VolkerK
It works! Thank you a lot! This is trully thinking in SQL way!
artvolk
Keep in mind that I'm not a (My)SQL expert. It's completely possible that someone else will present a far better solution (+pointing out issues with this solution). I took the liberty to change the tags in hope to "lure" (even) more experts into this questions ;-)
VolkerK