views:

548

answers:

3

I'm using MySQL. I have a table which looks like that:

id: primary key
content: varchar
weight: int

What I want to do is randomly select one row from this table, but taking into account the weight. For example, if I have 3 rows:

id, content, weight
1, "some content", 60
2, "other content", 40
3, "something", 100

The first row has 30% chance of being selected, the second row has 20% chance of being selected, and the third row has 50% chance of being selected.

Is there a way to do that ? If I have to execute 2 or 3 queries it's not a problem.

Thanks in advance.

A: 

I don't remember how to RND() in mysql, but here working example for MSSQL:

SELECT TOP(1) (weight +RAND ()) r, id, content, weight FROM Table
ORDER BY 1 DESC

If TOP(1) is not applicable you just fetch first record from total result set.

Dewfy
This way random outweights any weight ;-)
Michael Krelin - hacker
@hacker thanks - just edited
Dewfy
Uhm.. now randomness comes into play only for rows with the highest weight.
Michael Krelin - hacker
@hacker thanks once again
Dewfy
;-) now you add 0 to 1 random to the weight of 1000. That isn't really helpful. The closest you can get at this level of simplicity is if you multply weight by random number then. But whether it really fits the specs needs more thought that I can't give it right now being sick;-)
Michael Krelin - hacker
Okay, it was weight of 100, sorry, but it doesn't really matter ;-)
Michael Krelin - hacker
`SELECT *, weight*random() as o FROM table ORDER BY o DESC LIMIT 1` is what I mean.
Michael Krelin - hacker
`SELECT * FROM table ORDER BY weight*random() DESC LIMIT 1` looks better, shorter and transfers less data ;-)
Michael Krelin - hacker
This doesn't work in SQL Server, sorry. In a SELECT like this, rand() will be treated as a constant expression, and every row will get the same value. You can hack it using something like checksum(newid()), and then normalizing it to the range you're after. Kind of icky but should work.
Cowan
Cowan, the question is tagged `mysql`, anyway ;-) Now that my comment was upvoted, I wonder if I should post my comment as an answer without giving it more thought ;-)
Michael Krelin - hacker
hacker, yes, the question is mysql but Dewfy said that was a 'working example for MSSQL', which it's not. :)
Cowan
Cowan, indeed, then the "working" bit should be edited. "Non-working example" sounds better then ;-)
Michael Krelin - hacker
A: 

Maybe this one:

SELECT * FROM <Table> T JOIN (SELECT FLOOR(MAX(ID)*RAND()) AS ID FROM <Table> ) AS x ON T.ID >= x.ID LIMIT 1;

Or this one:

SELECT * FROM tablename
          WHERE somefield='something'
          ORDER BY RAND() LIMIT 1
StarWind Software
+1  A: 

This works in MSSQL and I am sure that it should be possible to change couple of keywords to make it work in MySQL as well (maybe even nicer):

SELECT      TOP 1 t.*
FROM        @Table t
INNER JOIN (SELECT      t.id, sum(tt.weight) AS cum_weight
            FROM        @Table t
            INNER JOIN  @Table tt ON  tt.id <= t.id
            GROUP BY    t.id) tc
        ON  tc.id = t.id,
           (SELECT  SUM(weight) AS total_weight FROM @Table) tt,
           (SELECT  RAND() AS rnd) r
WHERE       r.rnd * tt.total_weight <= tc.cum_weight
ORDER BY    t.id ASC

The idea is to have a cumulative weight for each row (subselect-1), then find the position of the spanned RAND() in this cumulative range.

van