I have a MySQL table called items that contains thousands of records. Each record has a user_id field and a created (datetime) field.

I'm trying to put together a query that SELECTs 25 rows, takes a list of user ids as a condition, and sorts by created DESC.

In some cases, there might be just a few user ids, while in other instances, there may be hundreds.

If the result set is greater than 25, I want to pare it down by eliminating duplicate user_id records. For instance, if there were two records for user_id = 3, only the most recent (according to created datetime) would be included.

In my attempts at a solution, I am having trouble because while, for example, it's easy to get a result set of 100 (allowing duplicate user_id records), or a result set of 16 (using GROUP BY for unique user_id records), it's hard to get 25.

One logical approach, which may not be the correct MySQL approach, is to get the most recent record for each user_id, and then, if the result set is less than 25, begin adding a second record for each user_id until the 25 record limit is met (maybe a third, fourth, etc. record for each user_id would be needed).

Can this be accomplished with a MySQL query, or will I need to take a large result set and trim it down to 25 with code?

+1  A: 

I don't think what you're trying to accomplish is possible as a single SQL query. You want exactly 25 rows no matter how the data naturally groups, whereas SQL is usually picky about returning rows based on those groupings.

If you want a purely MySQL-based solution, you may be able to accomplish this with a stored procedure. (Supported in MySQL 5.0.x and later.) However, it might just make more sense to run the query to return all 100+ rows and then trim it programmatically within the application.
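If you do go the "trim it in the application" route, the trimming itself is short. Below is a minimal Python sketch of the round-robin idea from the question: take every user's newest row first, then second-newest rows, and so on until the limit is reached. The function name `trim_rows` and the `(user_id, created)` tuple shape are assumptions made for illustration; adapt them to whatever your driver returns.

```python
from collections import defaultdict
from itertools import zip_longest


def trim_rows(rows, limit=25):
    """Pick up to `limit` rows, preferring distinct user_ids.

    `rows` is an iterable of (user_id, created) tuples (an assumed shape);
    `created` only needs to be sortable, e.g. datetime objects or ISO strings.
    """
    # Group each user's rows, newest first.
    by_user = defaultdict(list)
    for row in sorted(rows, key=lambda r: r[1], reverse=True):
        by_user[row[0]].append(row)

    picked = []
    # Round 0 holds every user's newest row, round 1 the second newest, etc.
    for round_rows in zip_longest(*by_user.values()):
        candidates = [r for r in round_rows if r is not None]
        # Within a round, prefer the newest rows if not all of them fit.
        candidates.sort(key=lambda r: r[1], reverse=True)
        picked.extend(candidates[: limit - len(picked)])
        if len(picked) >= limit:
            break

    # Final display order: created DESC.
    return sorted(picked, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    sample = [
        (1, "2009-01-05"), (1, "2009-01-04"), (2, "2009-01-03"),
        (3, "2009-01-02"), (1, "2009-01-01"),
    ]
    for row in trim_rows(sample, limit=4):
        print(row)
```

The cost is that you pull the full candidate set over the wire before trimming, which is probably fine for a few hundred rows but worth measuring if the user list grows.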

Chris Ess
+1  A: 

This will get you the most recent for each user --

SELECT i1.user_id, i1.created
FROM items AS i1
LEFT JOIN items AS i2
ON i1.user_id = i2.user_id AND i1.created < i2.created -- i2 is any newer row for the same user
WHERE i2.id IS NULL -- keep only rows that have no newer row

This will get you the most recent two records for each user --

SELECT i1.user_id, i1.created
FROM items AS i1
LEFT JOIN items AS i2
ON i1.user_id = i2.user_id AND i1.created < i2.created
LEFT JOIN items AS i3 ON i1.user_id = i3.user_id AND i1.created < i3.created AND i3.id <> i2.id
WHERE i3.id IS NULL -- no two distinct newer rows exist, so i1 is in the user's top two

Try working from there.

You could nicely put this into a stored procedure.
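Working from there, one way to generalise "most recent N per user" without chaining more joins is to count, for each row, how many newer rows the same user has, then sort on that count before applying the limit. The sketch below runs such a query through Python's `sqlite3` purely as a stand-in for MySQL (an assumption: the SQL is plain enough that it should carry over, but it has not been run against MySQL here). `make_demo_db` and `ranked_rows` are hypothetical helper names, and the sample rows are made up.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory items table (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, user_id INTEGER, created TEXT)"
    )
    conn.executemany(
        "INSERT INTO items (user_id, created) VALUES (?, ?)",
        [
            (1, "2009-01-05 12:00:00"),
            (1, "2009-01-04 12:00:00"),
            (2, "2009-01-03 12:00:00"),
            (3, "2009-01-02 12:00:00"),
            (1, "2009-01-01 12:00:00"),
            (4, "2009-01-06 12:00:00"),  # user not in the requested list
        ],
    )
    return conn


def ranked_rows(conn, user_ids, limit=25):
    """Return up to `limit` (user_id, created) rows, newest-per-user first."""
    if not user_ids:
        return []
    # sqlite3 uses "?" placeholders; MySQL drivers typically use "%s" instead.
    placeholders = ", ".join("?" for _ in user_ids)
    sql = f"""
        SELECT user_id, created
          FROM (SELECT i1.user_id, i1.created,
                       (SELECT COUNT(*)
                          FROM items AS i2
                         WHERE i2.user_id = i1.user_id
                           AND i2.created > i1.created) AS newer_count
                  FROM items AS i1
                 WHERE i1.user_id IN ({placeholders})
                 ORDER BY newer_count, i1.created DESC
                 LIMIT ?) AS picked
         ORDER BY created DESC
    """
    return conn.execute(sql, [*user_ids, limit]).fetchall()


if __name__ == "__main__":
    for row in ranked_rows(make_demo_db(), [1, 2, 3], limit=4):
        print(row)
```

Rows with `newer_count = 0` are each user's latest, `1` the second latest, and so on, so the inner LIMIT fills up with distinct users before it admits any duplicates. The correlated subquery runs once per candidate row, so an index on (user_id, created) would likely matter on a large table.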

le dorfier
"æach"? ;-)
Tomalak
Ahh, good ol' vim and its digraphs. :)
le dorfier
+1  A: 

My opinion is to use application logic, as this is very much application layer logic you are trying to implement at the DB level, i.e. filtering down the results to make the search more useful to the end user.

You could implement a stored procedure (personally I would never do such a thing) or just get the application to decide which 25 results.

karim79
+1  A: 

One approach would be to get the most recent item from each user, followed by the most recent items from all users, and limit that. You could construct pathological examples where this probably isn't what you want, but it should be pretty good in general.
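As a rough sketch of that idea, the query below flags each row as "latest for its user" or not, sorts flagged rows first and everything else by recency, and limits the result. It is run through Python's `sqlite3` only as a stand-in for MySQL (an assumption; it has not been tested against MySQL here). `make_sample_db` and `latest_then_recent` are hypothetical names and the data is invented.

```python
import sqlite3


def make_sample_db():
    """Build a tiny in-memory items table (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, user_id INTEGER, created TEXT)"
    )
    conn.executemany(
        "INSERT INTO items (user_id, created) VALUES (?, ?)",
        [
            (1, "2009-01-05 12:00:00"),
            (1, "2009-01-01 12:00:00"),
            (2, "2009-01-04 12:00:00"),
            (2, "2009-01-03 12:00:00"),
            (3, "2009-01-02 12:00:00"),
        ],
    )
    return conn


def latest_then_recent(conn, user_ids, limit=25):
    """Each user's latest row first, then the most recent rows overall."""
    if not user_ids:
        return []
    # sqlite3 uses "?" placeholders; MySQL drivers typically use "%s" instead.
    placeholders = ", ".join("?" for _ in user_ids)
    sql = f"""
        SELECT user_id, created
          FROM (SELECT i1.user_id, i1.created,
                       i1.created = (SELECT MAX(i2.created)
                                       FROM items AS i2
                                      WHERE i2.user_id = i1.user_id) AS is_latest
                  FROM items AS i1
                 WHERE i1.user_id IN ({placeholders})
                 ORDER BY is_latest DESC, i1.created DESC
                 LIMIT ?) AS picked
         ORDER BY created DESC
    """
    return conn.execute(sql, [*user_ids, limit]).fetchall()


if __name__ == "__main__":
    for row in latest_then_recent(make_sample_db(), [1, 2, 3], limit=4):
        print(row)
```

With a limit of 3 on this sample, user 3's only row is kept ahead of user 2's newer second row; once every user is represented, the remaining slots go to whatever is most recent, even if that means several rows from one busy user. That is the kind of pathological case mentioned above.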

MarkusQ
That's a good simple solution. I can select one from all users, and if it's less than 25, fill in the gap by taking the most recent overall with no regard for dupe users.
chipotle_warrior
+1  A: 

Unfortunately, there is no easy way :( I had to do something similar when I built a report for my company that would pull up customer disables that were logged in a database. The only problem was that the disconnect is run and logged every 30 minutes, so the rows were never distinct: the timestamp was different for every disconnect. I solved this problem with subqueries. I don't have the exact code anymore, but I believe this is how I implemented it:

SELECT corp, house, cust,
 (
  SELECT TOP 1 hsd
  FROM #TempTable t2
  WHERE t1.corp = t2.corp
  AND t1.house = t2.house
  AND t1.cust = t2.cust
  ORDER BY t2.hsd DESC -- without an ORDER BY, TOP 1 returns an arbitrary row
 ) DisableDate
FROM #TempTable t1
GROUP BY corp, house, cust -- selecting distinct

So, my answer is to eliminate the non-distinct column from the query by using subqueries. There might be an easier way to do it though. I'm curious to see what others post.

Sorry, I keep editing this; I keep trying to find ways to make it easier to show what I did.

regex