I have a table with the hypothetical columns foo and bar. bar might have 50-60 distinct values in it. My goal is to pick up to 5 rows for each of, say, 6 unique bars. So if the 6 unique bars that get selected out of the 50-60 each happen to have at least 5 rows of data, we'll have 30 rows in total.

A: 

Is this getting called from some program?

If so, perhaps you can just look up the bars, pick some at random, and pass them into a select statement.

This way your select could simply be: select * from table where bar in (?,?), and you can move the randomness problem into code, which is frankly better at dealing with that.
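A minimal sketch of that idea, not a definitive implementation. It assumes a table named sometable with columns foo and bar, and uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The function names pick_rows and make_demo_db and the demo data are hypothetical.

```python
import random
import sqlite3
from collections import defaultdict


def pick_rows(conn, num_bars=6, rows_per_bar=5, rng=random):
    """Return up to rows_per_bar random rows for each of up to num_bars random bars."""
    # Look up the distinct bar values (only 50-60 of them, so this is cheap).
    bars = [row[0] for row in conn.execute("SELECT DISTINCT bar FROM sometable")]
    # Do the random choice in code instead of in SQL.
    chosen = rng.sample(bars, min(num_bars, len(bars)))
    if not chosen:
        return []
    # One "?" placeholder per chosen bar: bar IN (?,?,...).
    placeholders = ",".join("?" for _ in chosen)
    query = f"SELECT foo, bar FROM sometable WHERE bar IN ({placeholders})"
    by_bar = defaultdict(list)
    for foo, bar in conn.execute(query, chosen):
        by_bar[bar].append((foo, bar))
    # Trim each group to at most rows_per_bar rows, again in code.
    result = []
    for rows in by_bar.values():
        result.extend(rng.sample(rows, min(rows_per_bar, len(rows))))
    return result


def make_demo_db():
    """Hypothetical demo data: 10 bars; bar 'b0' has 3 rows, the others 8 each."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sometable (id INTEGER PRIMARY KEY, foo INTEGER, bar TEXT)"
    )
    for i in range(10):
        row_count = 3 if i == 0 else 8
        conn.executemany(
            "INSERT INTO sometable (foo, bar) VALUES (?, ?)",
            [(i * 100 + j, f"b{i}") for j in range(row_count)],
        )
    return conn


if __name__ == "__main__":
    for foo, bar in pick_rows(make_demo_db()):
        print(bar, foo)
```

Note that this fetches every row for the chosen bars and trims in code, so it only makes sense when each bar has a modest number of rows.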

Nathan Feger
A: 

I think the easiest way is to use a UNION.

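(SELECT * FROM table WHERE bar = 'a' LIMIT 5) UNION (SELECT * FROM table WHERE bar = 'b' LIMIT 5) UNION (SELECT ....... you get the gist, I hope. Note the parentheses: MySQL needs them to apply a LIMIT to each individual SELECT rather than to the whole UNION.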

EDIT: Not sure if this is what you need. You don't say whether this query also needs to somehow determine the bars, or whether they are passed in.

benlumley
A: 

It's been a while since I've worked with MySQL (I've been working with MSSQL lately), but two things come to mind:

  • Some sort of self join
  • A Cursor

Self join might look something like

SELECT t2.*
   FROM (SELECT DISTINCT bar FROM table LIMIT 5) AS t1
   JOIN table AS t2 ON t1.bar = t2.bar

Again, it's been a while, so this might not be valid MySQL. Also, you'd get all the foos back for the 5 bars, so you'd have to figure out how to trim that down.

Jason M
+2  A: 

What you'd really want to do is:

SELECT *
FROM `sometable`
WHERE `bar` IN (
    SELECT DISTINCT `bar`
    FROM `sometable`
    ORDER BY RAND()
    LIMIT 6
)

Unfortunately, you're likely to get this:

ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

Possibly your version will be more cooperative. Otherwise, you'll probably need to do it as two queries.

chaos
I was going to recommend the same solution, just not so sure about the viability of sub-queries.
David
That query can return more than 5 rows per unique bar, so it's not what he's looking for.
David Grayson
A: 

A simple solution that takes 7 queries:

SELECT distinct bar FROM sometable ORDER BY rand() LIMIT 6

Then, for each of the 6 bar values above, do this, replacing {$bar} with the value (properly escaped), of course:

SELECT foo,bar FROM sometable WHERE bar='{$bar}' ORDER BY rand() LIMIT 5

Be careful about using "ORDER BY rand()" because it might cause MySQL to fetch a LOT of rows from your table, and compute the rand() function for all of them, and then sort them. This can take a long time if you have a big table.

If it does take a long time, then for the first query, you can remove the ORDER BY and the LIMIT clauses, and select 6 random values in your program code after the query is done.

For the second query, you can split it into two steps:

SELECT count(*) FROM sometable WHERE bar='{$bar}'

Then, in your program code, you know how many items there are so you can randomly choose which of them to look at, and use OFFSET and LIMIT:

SELECT foo,bar FROM sometable WHERE bar='{$bar}' LIMIT 1 OFFSET {$offset}
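Here is a rough sketch of the count-then-offset steps for a single bar. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and a sometable that also has an integer primary key column named id. The id column, the ORDER BY id, and the names sample_rows_for_bar and make_demo_db are assumptions added for illustration. The ORDER BY is there because, without an explicit order, the database is not obliged to return rows in the same order from one query to the next, so an offset might not point at a stable row.

```python
import random
import sqlite3


def sample_rows_for_bar(conn, bar, limit=5, rng=random):
    """Pick up to `limit` random rows for one bar using COUNT(*) plus OFFSET."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM sometable WHERE bar = ?", (bar,)
    ).fetchone()
    # Choose distinct offsets in code; no ORDER BY RAND() is needed.
    offsets = rng.sample(range(count), min(limit, count))
    rows = []
    for offset in offsets:
        # ORDER BY id (an assumption) keeps each offset tied to a stable row order.
        row = conn.execute(
            "SELECT foo, bar FROM sometable WHERE bar = ? "
            "ORDER BY id LIMIT 1 OFFSET ?",
            (bar, offset),
        ).fetchone()
        rows.append(row)
    return rows


def make_demo_db():
    """Hypothetical demo data: bar 'a' has 20 rows, bar 'b' has 3 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sometable (id INTEGER PRIMARY KEY, foo INTEGER, bar TEXT)"
    )
    data = [(n, "a") for n in range(20)] + [(100 + n, "b") for n in range(3)]
    conn.executemany("INSERT INTO sometable (foo, bar) VALUES (?, ?)", data)
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for bar in ("a", "b"):
        print(bar, sample_rows_for_bar(conn, bar))
```

Passing the bar value as a bound parameter instead of splicing it into the SQL string also sidesteps quoting problems. This version issues one query per selected row, so it trades the cost of sorting for more round trips.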
David Grayson