How do I select a (or some) random row(s) from a table using SQLAlchemy?

A: 

There are a couple of ways to do this through SQL, depending on which database is being used.

(I think SQLAlchemy can use all of these anyway.)

MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

PostgreSQL:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

MSSQL:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

IBM DB2:

SELECT column, RAND() as IDX
FROM table
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Oracle:

SELECT column FROM
(SELECT column FROM table
ORDER BY dbms_random.value)
WHERE rownum = 1

However, I don't know of any standard way.

Fire Lancer
Yeah. I know how to do it in SQL (I posted that answer in http://beta.stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql#19568) but was searching for a SQLAlchemy specific solution.
cnu
+5  A: 

This is very much a database-specific issue.

I know that PostgreSQL and MySQL have the ability to order by a random function, so you can use this in SQLAlchemy:

select.order_by(func.random()) # for PostgreSQL

select.order_by(func.rand()) # for MySQL

Next, you need to limit the query to the number of records you need (for example, using .limit()).

Bear in mind that, at least in PostgreSQL, selecting a random record this way has severe performance issues; here is a good article about it.
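
As a rough end-to-end sketch of the above (not a definitive implementation), the snippet below runs the same order_by/limit pattern through the ORM against an in-memory SQLite database, where func.random() also works. The Item model and its columns are hypothetical and exist only for illustration; func is imported from the top-level sqlalchemy package, and SQLAlchemy 1.4 or later is assumed for the declarative_base import.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):  # hypothetical table, for illustration only
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")  # in-memory SQLite database
Base.metadata.create_all(engine)

session = Session(engine)
session.add_all([Item(name="item-%d" % i) for i in range(10)])
session.commit()

# func.random() renders as random(); on MySQL use func.rand() instead.
random_rows = session.query(Item).order_by(func.random()).limit(3).all()
print([row.name for row in random_rows])
```

The database still has to assign a random value to every row and sort them all, so this is best kept for tables of moderate size.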

Łukasz
Where is the func module? Is there any docs in SA?
cnu
+1. Same as Postgres works for SQLite: `select.order_by(func.random()).limit(n)`
Adam Bernier
+3  A: 

If you are using the ORM, the table is not big (or you have its row count cached), and you want the solution to be database independent, the really simple approach is:

import random
rand = random.randrange(0, session.query(Table).count()) 
row = session.query(Table)[rand]

This is cheating slightly, but that's why you use an ORM.
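
A slightly more defensive sketch of the same idea, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The Entry model and the random_row helper are hypothetical names introduced here; the helper assumes the model has an id primary key, guards against an empty table, and orders by that key so the random offset points at a stable position.

```python
import random

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Entry(Base):  # hypothetical table, for illustration only
    __tablename__ = "entries"
    id = Column(Integer, primary_key=True)
    label = Column(String)


def random_row(session, model):
    """Return one random row of `model`, or None if the table is empty."""
    count = session.query(model).count()
    if count == 0:
        return None  # random.randrange(0) would raise ValueError
    offset = random.randrange(count)
    # Order by primary key so the offset refers to a stable position.
    return session.query(model).order_by(model.id).offset(offset).first()


engine = create_engine("sqlite://")  # in-memory SQLite database
Base.metadata.create_all(engine)
session = Session(engine)

empty_result = random_row(session, Entry)  # table is still empty here
session.add_all([Entry(label="entry-%d" % i) for i in range(5)])
session.commit()
picked = random_row(session, Entry)
print(empty_result, picked.label)
```

This costs two queries (a count and an offset lookup), but it uses no database-specific random function.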

David Raznick
rand = random.randrange(0, session.query(Table).count())
Alabaster Codify
Yep, I did not know the idiomatic way; will update.
David Raznick
A: 

An enhanced version of Lukasz's example, in case you need to select multiple rows at random:

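import random

# You must first select all the values of the primary key field for the table.
# In some particular cases you can use range(session.query(Table).count()) instead
# (xrange on Python 2).
# The query yields one-element rows, so unpack each to get the bare key value.
ids = [pk for (pk,) in session.query(Table.primary_key_field)]
# random.sample raises ValueError if asked for more items than are available.
ids_sample = random.sample(ids, min(100, len(ids)))

rows = session.query(Table).filter(Table.primary_key_field.in_(ids_sample))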

So, this post is just to point out that you can use .in_ to select multiple rows at the same time.

dalloliogm
Well, you could at least explain the negative votes.
dalloliogm
A: 

I would suggest you use the ORM Table. See post #1.