views:

809

answers:

3

Hi,

I run a website where users can post items (e.g. pictures). The items are stored in a MySQL database.

I want to query for the last ten posted items BUT with the constraint of a maximum of 3 items can come from any single user.

What is the best way of doing it? My preferred solution is a constraint that is put on the SQL query requesting the last ten items. But ideas on how to set up the database design is very welcome.

Thanks in advance!

BR

+1  A: 

This is difficult because MySQL does not support the LIMIT clause on sub-queries. If it did, this would be rather trivial... But alas, here is my naïve approach:

SELECT
  i.UserId,
  i.ImageId
FROM
  UserSuppliedImages i
WHERE
  /* second valid ImageId */
  ImageId = (
    SELECT MAX(ImageId)
    FROM UserSuppliedImages
    WHERE UserId = i.UserId
  )
  OR
  /* second valid ImageId */
  ImageId = (
    SELECT MAX(ImageId)
    FROM   UserSuppliedImages
    WHERE UserId = i.UserId
      AND ImageId < (
        SELECT MAX(ImageId)
        FROM UserSuppliedImages
        WHERE UserId = i.UserId
      )
    )
  /* you get the picture... 
     the more "per user" images you want, the more complex this will get */
LIMIT 10;

You did not comment on having a preferred result order, so this selects the latest images (assuming ImageId is an ascending auto-incrementing value).

For comparison, on SQL Server the same would look like this:

SELECT TOP 10
  img.ImageId,
  img.ImagePath,
  img.UserId
FROM
  UserSuppliedImages img
WHERE
  ImageId IN (
    SELECT TOP 3 ImageId
    FROM UserSuppliedImages 
    WHERE UserId = img.UserId
  )
Tomalak
A: 

I would first select 10 distinct users, then selecting images from each of those users with a LIMIT 3, possibly by a union of all those and limit that to 10.

That would atleast narrow down the data you need to process to a fair amount.

jishi
+1  A: 

It's pretty easy with a correlated sub-query:

SELECT `img`.`id` , `img`.`userid`
FROM `img`
WHERE 3 > (
SELECT count( * )
FROM `img` AS `img1`
WHERE `img`.`userid` = `img1`.`userid`
AND `img`.`id` > `img1`.`id` )
ORDER BY `img`.`id` DESC
LIMIT 10

The query assumes that larger id means added later

Correlated sub-queries are a powerful tool! :-)

Incidently
Can you express in plain English how this works? I don't get it.
Tomalak
The subquery will eliminate any rows for which there are already three rows with the same userid and a lower image id. I think the greater than symbol in the subquery might be backwards though.
Tom H.
This is an example of SQL being more expressive than plain English :-)Tom H. gave a good explanation; and yes, I'm sure the greater than symbol is correct
Incidently
Ah. Now I understand. And the greater than is in fact correct. This is a lot better than mine. +1
Tomalak
@Niklas: For the fun of it - can you test my solution in regard to performance with your data? In terms of maintainability, Incidently's solution is the clear winner. But I have tested both on SQL Server, on a good amount of data, and mine feels a lot faster. YMMV, I'm interested what you find!
Tomalak
Wow. I'm totally favoriting this question, just so I can show people this query.
Jarett
Won't this exclude all those who have more than 3 items rather than simply limit the number fetched from them to 3?
philistyne