views:

214

answers:

3

I'd like to paginate through a randomly sorted list of ActiveRecord models (rows from MySQL database).

However, this randomization needs to persist on a per-session basis, so that other people that visit the website also receive a random, paginate-able list of records.

Let's say there are enough entities (tens of thousands) that storing the randomly sorted ID values in either the session or a cookie is too large, so I must temporarily persist it in some other way (MySQL, file, etc.).

Initially I thought I could create a function based on the session ID and the page ID (returning the object IDs for that page) however since the object ID values in MySQL are not sequential (there are gaps), that seemed to fall apart as I was poking at it. The nice thing is that it would require no/minimal storage but the downsides are that it is likely pretty complex to implement and probably CPU intensive.

My feeling is I should create an intersection table, something like:

random_sorts( sort_id, created_at, user_id NULL if guest)

random_sort_items( sort_id, item_id, position )

And then simply store the 'sort_id' in the session. Then, I can paginate the random_sorts WHERE sort_id = n ORDER BY position LIMIT... as usual.

Of course, I'd have to put some sort of a reaper in there to remove them after some period of inactivity (based on random_sorts.created_at).

Unfortunately, I'd have to invalidate the sort as new objects were created (and/or old objects being removed, although deletion is very rare). And, as load increases the size/performance of this table (even properly indexed) drops.

It seems like this ought to be a solved problem but I can't find any rails plugins that do this... Any ideas? Thanks!!

+1  A: 

I'm probably missing something, but wouldn't something like this

select ... order by sha1(concat($session_id,item_id)) limit m,n;

work to give you a random-ordered, repeatable per-session paginated list ? Not very nice on index usage but you avoid any pre-filling / tmp tables / invalidation.

ggiroux
A: 

Personally, to save storage space and sanity, I'd just use a random seed using your user_id.

srand user_id
items.sort_by{ rand }
ghoppe
That would use a lot of ruby memory to sort the entire array, right?
Matt Rogish
I'm a little beyond my depth when it comes to the internals, but since it's just sorting pointers, what would be a more efficient way? Perhaps a custom .shuffle! method with a passed random seed? http://stackoverflow.com/questions/2039902/how-does-rubys-sort-by-rand-work
ghoppe
+3  A: 

MySQL has a RAND function you can use in your ORDER clause, passing a seed tied to the user session.

ORDER BY RAND(?)

Where ? is a seed value from the session. This will give you repeatable ordering across requests.

Toby Hede
Yeah, that works as long as the rows in the table never change (if a new one is added, it's my understanding that the whole set may change). Also, it causes a table scan each time, which can be a big performance hit...
Matt Rogish