How do I select one (or several) random rows from a table using SQLAlchemy?
There are a couple of ways to do this in SQL, depending on which database is being used.
(I think SQLAlchemy can use all of these anyway.)
MySQL:
SELECT column FROM table
ORDER BY RAND()
LIMIT 1
PostgreSQL:
SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1
MSSQL:
SELECT TOP 1 column FROM table
ORDER BY NEWID()
IBM DB2:
SELECT column, RAND() as IDX
FROM table
ORDER BY IDX FETCH FIRST 1 ROWS ONLY
Oracle:
SELECT column FROM
(SELECT column FROM table
ORDER BY dbms_random.value)
WHERE rownum = 1
However, I don't know of any standard way.
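If you want to try the ORDER BY idiom without setting up a database server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite spells the function RANDOM(), like PostgreSQL. The `items` table and its contents are made up for the demo.

```python
import sqlite3

# In-memory database with a hypothetical "items" table, just for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("item-%d" % i,) for i in range(10)],
)

# One random row: shuffle the rows with RANDOM() and keep the first.
one_row = conn.execute(
    "SELECT id, name FROM items ORDER BY RANDOM() LIMIT 1"
).fetchone()

# Raising the LIMIT returns several distinct random rows.
three_rows = conn.execute(
    "SELECT id, name FROM items ORDER BY RANDOM() LIMIT 3"
).fetchall()

print(one_row)
print(three_rows)
```

The other databases listed above follow the same pattern, only with their own function name and their own way of limiting the result.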
This is very much a database-specific issue.
I know that PostgreSQL and MySQL can order by a random function, so you can use this in SQLAlchemy:
select.order_by(func.random()) # for PostgreSQL
select.order_by(func.rand()) # for MySQL
Next, limit the query to the number of records you need (for example, using .limit()).
Bear in mind that, at least in PostgreSQL, selecting a random record this way has severe performance issues, since the database has to read every row of the table to order it; here is a good article about it.
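Putting order_by and limit together, a complete run might look like the sketch below. It assumes SQLAlchemy 1.4 or newer and uses an in-memory SQLite database so it runs anywhere. The `Item` model and the `random_order` helper are hypothetical names for illustration, and the helper only maps the dialects I am sure about.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):
    # Hypothetical table, used only for this demo.
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String)


def random_order(dialect_name):
    """Return the random ORDER BY expression for a SQLAlchemy dialect name."""
    if dialect_name == "mysql":
        return func.rand()
    if dialect_name in ("postgresql", "sqlite"):
        return func.random()
    raise NotImplementedError("no random function mapped for %r" % dialect_name)


engine = create_engine("sqlite://")  # in-memory SQLite
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Item(name="item-%d" % i) for i in range(20)])
    session.commit()

    # Shuffle on the database side, then keep only five rows.
    order = random_order(engine.dialect.name)
    rows = session.query(Item).order_by(order).limit(5).all()
    names = [row.name for row in rows]

print(names)
```

Looking up engine.dialect.name keeps the choice between rand() and random() in one place, so the query code itself stays the same across backends.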
If you are using the ORM, the table is not big (or you have its row count cached), and you want the code to be database independent, a really simple approach is:
import random
rand = random.randrange(0, session.query(Table).count())
row = session.query(Table)[rand]
This is cheating slightly, but that's why you use an ORM.
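One catch with the snippet above: random.randrange raises ValueError when the table is empty. Here is a rough sketch of the same count-then-offset idea wrapped in a helper that handles that case, assuming SQLAlchemy 1.4 or newer. The `random_row` helper and the demo model are my own names, not part of SQLAlchemy.

```python
import random

from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table(Base):
    # Hypothetical model standing in for the Table used above.
    __tablename__ = "demo_table"
    id = Column(Integer, primary_key=True)
    name = Column(String)


def random_row(session, model):
    """Return one random row of `model`, or None if the table is empty."""
    count = session.query(model).count()
    if count == 0:
        return None  # randrange(0) would raise ValueError
    offset = random.randrange(count)
    # Order by the primary key so OFFSET refers to a well-defined row.
    pk_columns = inspect(model).primary_key
    return session.query(model).order_by(*pk_columns).offset(offset).first()


engine = create_engine("sqlite://")  # in-memory SQLite, just for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    nothing = random_row(session, Table)  # table is still empty here
    session.add_all([Table(name="row-%d" % i) for i in range(10)])
    session.commit()
    picked_name = random_row(session, Table).name

print(nothing, picked_name)
```

This still costs two queries per pick, and a large OFFSET makes the database skip over that many rows, so it suits small tables best.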
An enhanced version of Lukasz's example, in the case you need to select multiple rows at random:
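import random
# First select all the primary key values for the table.
# If the keys happen to be the contiguous integers 0..count-1, you can use
# range(session.query(Table).count()) instead (xrange on Python 2).
# Each result row is a one-element tuple, so unpack it to get the bare key.
ids = [pk for (pk,) in session.query(Table.primary_key_field)]
# random.sample raises ValueError if asked for more items than exist.
ids_sample = random.sample(ids, min(100, len(ids)))
rows = session.query(Table).filter(Table.primary_key_field.in_(ids_sample))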
So, this post is just to point out that you can use .in_ to select multiple rows at the same time.
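For reference, here is how the sample-the-keys approach might look as a self-contained, runnable script, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database. The `random_rows` helper and the demo model are hypothetical. Note that the keys are unpacked from the result rows before being passed to .in_, and the sample size is capped at the number of rows available.

```python
import random

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table(Base):
    # Hypothetical model, used only for this demo.
    __tablename__ = "demo_table"
    id = Column(Integer, primary_key=True)
    name = Column(String)


def random_rows(session, model, pk_column, n):
    """Return up to `n` distinct random rows of `model`."""
    # Each result row is a one-element tuple; unpack it to get the bare key.
    ids = [pk for (pk,) in session.query(pk_column)]
    chosen = random.sample(ids, min(n, len(ids)))
    if not chosen:
        return []
    # IN does not preserve the sample order; rows come back in database order.
    return session.query(model).filter(pk_column.in_(chosen)).all()


engine = create_engine("sqlite://")  # in-memory SQLite
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Table(name="row-%d" % i) for i in range(50)])
    session.commit()
    sample_ids = sorted(row.id for row in random_rows(session, Table, Table.id, 5))
    all_ids = [row.id for row in random_rows(session, Table, Table.id, 1000)]

print(sample_ids)
```

Loading every key is fine for modest tables, but for very large ones both the key list and the IN clause get expensive, so the ORDER BY approach with a limit may be the better trade.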