I have seen random rows pulled using queries like this, which are quite inefficient for large data sets.

SELECT id FROM table ORDER BY RAND() LIMIT 1

I have also seen various other RDBMS-specific solutions that don't work with MySQL.

The best thing I can think of doing off-hand is using two queries and doing something like this.

  1. Get the number of rows in the table. MyISAM tables store the row count so this is very fast.
  2. Calculate a random number between 0 and rowcount - 1.
  3. Select a row ordered by primary key, with a LIMIT randnum, 1

Here's the SQL:

SELECT COUNT(*) FROM table;
SELECT id FROM table LIMIT randnum, 1;
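The two-query approach can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python with an in-memory SQLite database as a stand-in for MySQL, and the table name `items`, its columns, and the helper `random_row_by_offset` are hypothetical. The `LIMIT 1 OFFSET n` form used here is also accepted by MySQL and means the same as `LIMIT n, 1`.

```python
import random
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; "items" is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in (1, 2, 5, 8, 13)],  # ids with gaps on purpose
)


def random_row_by_offset(conn):
    """Pick one row uniformly: count the rows, then skip a random number of them."""
    (rowcount,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    if rowcount == 0:
        return None
    randnum = random.randrange(rowcount)  # integer in 0 .. rowcount - 1
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT 1 OFFSET ?", (randnum,)
    ).fetchone()


row = random_row_by_offset(conn)
print(row)
```

Gaps in the primary key do not matter here because the offset counts rows, not id values. The trade-off is that the server still has to step past `randnum` rows before returning one, so the cost grows with the offset.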

Does anyone have a better idea?

+2  A: 

In option 2, you might be able to drop the ORDER BY (I don't know), and it shouldn't make any difference aside from speed. Otherwise, that's what I'd do.

Edit: the OP dropped the "ORDER BY".

BCS
I'll edit that out of the post. Although I would think it shouldn't really matter much anyway, since that column is the primary key, is indexed and the rows are already in that order.
David
+2  A: 

Maybe you could do something like:

SELECT * FROM table 
  WHERE id=
    (FLOOR(RAND() * 
           (SELECT COUNT(*) FROM table)
          )
    );

This is assuming your ID numbers are all sequential with no gaps.

davr
Actually, you may want CEIL instead of FLOOR, depending on whether your IDs start at 0 or 1.
davr
That assumes that the expression is cached and not recalculated for every row.
BCS
There are gaps in the primary key, as some rows get deleted.
David
A: 

Add an indexed column containing a precalculated random value to each row, and use that column in the ordering clause, limiting the selection to one result. This works out faster than the full table scan and sort that ORDER BY RAND() causes.

Update: You still need to calculate some random value prior to issuing the SELECT statement upon retrieval, of course, e.g.

SELECT * FROM `foo` WHERE `foo_rand` >= {some random value} ORDER BY `foo_rand` LIMIT 1
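A minimal sketch of this idea, assuming Python with in-memory SQLite standing in for MySQL. The index name `foo_rand_idx`, the helper `random_row_by_rand_column`, and the wrap-around step are illustrative additions, not part of the answer above.

```python
import random
import sqlite3

# Assumption: SQLite in memory stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, foo_rand REAL)")
conn.execute("CREATE INDEX foo_rand_idx ON foo (foo_rand)")
# Each row gets its random value once, at insert time.
conn.executemany(
    "INSERT INTO foo (id, foo_rand) VALUES (?, ?)",
    [(i, random.random()) for i in range(1, 101)],
)


def random_row_by_rand_column(conn, threshold=None):
    """Return the row with the smallest foo_rand at or above a fresh threshold."""
    if threshold is None:
        threshold = random.random()  # new value per lookup, so the result varies
    row = conn.execute(
        "SELECT id, foo_rand FROM foo WHERE foo_rand >= ? "
        "ORDER BY foo_rand LIMIT 1",
        (threshold,),
    ).fetchone()
    if row is None:
        # Threshold was above every stored value: wrap around to the smallest.
        row = conn.execute(
            "SELECT id, foo_rand FROM foo ORDER BY foo_rand LIMIT 1"
        ).fetchone()
    return row


print(random_row_by_rand_column(conn))
```

Because the stored values are static, the result is only approximately uniform: a row whose `foo_rand` follows a large gap in the stored values is hit more often than one that follows a small gap.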
Rob
I thought about that. Add a new indexed column and on row creation, assign a random int to it. But the problem with that is I'm storing unnecessary data and you would still have to do something else to actually get a random row out of it, since the random column data is static.
David
A: 

Where I work we often assign a random number to a row as it is being inserted and then order by that field.

Joe Philllips
How does that actually help? You can have your random number column indexed and then order by that column? You'd always get the same row. Otherwise, you're left with doing what I described in the OP anyway.
David
No need to be so rude. You didn't make it clear what you wanted.
Joe Philllips
+1  A: 

The classic "SELECT id FROM table ORDER BY RAND() LIMIT 1" is actually OK.

See the following excerpt from the MySQL manual:

*If you use LIMIT row_count with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result.*
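To illustrate how the first row_count rows of a sorted result can be found without sorting everything, here is a small Python sketch. It is not a description of MySQL's internals; it only shows the general idea with a bounded selection (`heapq.nsmallest`), and the helper name `first_n_by_random_key` is hypothetical. Every row still receives a random key, so the work remains proportional to the number of rows.

```python
import heapq
import random


def first_n_by_random_key(ids, n, rng=random):
    """Return the n ids with the smallest random keys without a full sort."""
    # One random key per row is still generated: this part is O(rows).
    keyed = ((rng.random(), row_id) for row_id in ids)
    # nsmallest keeps only n candidates at a time instead of sorting all keys.
    return [row_id for _, row_id in heapq.nsmallest(n, keyed)]


print(first_n_by_random_key(range(1, 1001), 1))
```

So the sort itself can be cut short, but the scan over all rows cannot, which is why the query stays slow on large tables.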

igelkott
But it still has to assign a random number to each and every record, doesn't it? I ask because that explanation doesn't make much sense to me: how is it going to return the first N sorted rows if the whole result set is not sorted? :S
Damir Zekić
@igelkott, there's still a performance issue, so I guess it's not OK.
Unreality
A: 

Take a look at this link by Jan Kneschke or this SO answer as they both discuss the same question. The SO answer goes over various options also and has some good suggestions depending on your needs. Jan goes over all the various options and the performance characteristics of each. He ends up with the following for the most optimized method by which to do this within a MySQL select:

SELECT name
  FROM random AS r1 JOIN
       (SELECT (RAND() *
                     (SELECT MAX(id)
                        FROM random)) AS id)
        AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 1;
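The same idea, split into two statements, might look like this in Python. This is a sketch under assumptions: in-memory SQLite stands in for MySQL, the table is called `items` rather than `random`, and the random value is generated in Python instead of with RAND().

```python
import random
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; "items" is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in (1, 2, 5, 8, 13)],  # ids with gaps
)


def random_row_by_max_id(conn, rng=random):
    """Pick a random point below MAX(id) and return the first row at or after it."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM items").fetchone()
    if max_id is None:  # empty table
        return None
    target = rng.random() * max_id  # float in [0, max_id)
    return conn.execute(
        "SELECT id, name FROM items WHERE id >= ? ORDER BY id ASC LIMIT 1",
        (target,),
    ).fetchone()


print(random_row_by_max_id(conn))
```

This tolerates gaps in the ids, since `>=` always finds the next existing row. It is not uniform when gaps exist, though: with the ids above, row 13 covers the range from 8 to 13 and is chosen far more often than row 1.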

HTH,

-Dipin

Dipin
A: 

If you never delete rows from this table, the most efficient way is:

(If you already know the minimum id, you can leave it out of the first query.)

1) SELECT MIN(id) AS minId, MAX(id) AS maxId FROM table WHERE 1

2) $randId=mt_rand((int)$row['minId'], (int)$row['maxId']);

3) SELECT id,name,... FROM table WHERE id=$randId LIMIT 1
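For comparison, the three steps translate to Python roughly as below. Again this is only a sketch: in-memory SQLite stands in for MySQL, and the table `items` and the helper `random_row_by_id_range` are hypothetical. `randint` is inclusive on both ends, like `mt_rand`.

```python
import random
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; "items" is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(10, 21)],  # ids 10..20 with no gaps
)


def random_row_by_id_range(conn, rng=random):
    """Pick an id between MIN(id) and MAX(id) and fetch that exact row."""
    min_id, max_id = conn.execute(
        "SELECT MIN(id), MAX(id) FROM items"
    ).fetchone()
    if min_id is None:  # empty table
        return None
    rand_id = rng.randint(min_id, max_id)  # inclusive on both ends
    # Returns None if rand_id falls on a deleted row, hence the no-gaps condition.
    return conn.execute(
        "SELECT id, name FROM items WHERE id = ?", (rand_id,)
    ).fetchone()


print(random_row_by_id_range(conn))
```

The final lookup is a single primary-key probe, which is where the speed comes from. If rows are ever deleted, the function can return nothing for some draws, so the caller would have to retry or switch to one of the gap-tolerant approaches.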

parm.95