tags:

views: 369

answers: 5

I have a bunch of URLs stored in a table waiting to be scraped by a script. However, many of those URLs are from the same site. I would like to return those URLs in a "site-friendly" order (that is, try to avoid two URLs from the same site in a row) so I won't be accidentally blocked by making too many http requests in a short time.

The database layout is something like this:

create table urls (
    site varchar,       -- holds e.g. www.example.com or stackoverflow.com
    url varchar unique
);
Example result:
SELECT url FROM urls ORDER BY mysterious_round_robin_function(site);

http://www.example.com/some/file
http://stackoverflow.com/questions/ask
http://use.perl.org/
http://www.example.com/some/other/file
http://stackoverflow.com/tags

I thought of something like "ORDER BY site <> @last_site DESC" but I have no idea how to go about writing something like that.

+3  A: 

I think you're overcomplicating this. Why not just use

ORDER BY NewID()

Nissan Fan
Nice! Beautifully simplistic.
Deinumite
Is that similar to RANDOM()? If it is, it's not quite good enough. There are a couple of sites that comprise 10-30% of the table.
hhaamu
@hhaamu: I think that you're underestimating the power of RANDOM :)
wuub
+2  A: 

See this article in my blog for a more detailed explanation of how it works:

With new PostgreSQL 8.4:

SELECT  *
FROM    (
        SELECT  site, url, ROW_NUMBER() OVER (PARTITION BY site ORDER BY url) AS rn
        FROM    urls
        ) q
ORDER BY
        rn, site
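The same ROW_NUMBER() round-robin trick can be tried out with an in-memory SQLite database (window functions need SQLite 3.25+); the table and data below are made up from the question's example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (site TEXT, url TEXT UNIQUE)")
conn.executemany("INSERT INTO urls VALUES (?, ?)", [
    ("www.example.com", "http://www.example.com/some/file"),
    ("www.example.com", "http://www.example.com/some/other/file"),
    ("stackoverflow.com", "http://stackoverflow.com/questions/ask"),
    ("stackoverflow.com", "http://stackoverflow.com/tags"),
    ("use.perl.org", "http://use.perl.org/"),
])

# Number each site's URLs 1..n; ordering by that number first means every
# site's 1st URL comes out before any site's 2nd URL.
ordered = [url for (url,) in conn.execute("""
    SELECT url FROM (
        SELECT site, url,
               ROW_NUMBER() OVER (PARTITION BY site ORDER BY url) AS rn
        FROM urls
    ) ORDER BY rn, site
""")]
print(ordered)
```

No two consecutive URLs share a site, which is exactly the "site-friendly" order the question asks for.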

With older versions:

SELECT  site,
        (
        SELECT  url
        FROM    urls ui
        WHERE   ui.site = sites.site
        ORDER BY
                url
        OFFSET  total
        LIMIT   1
        ) AS url
FROM    ( 
        SELECT  site, generate_series(0, cnt - 1) AS total
        FROM    (
                SELECT  site, COUNT(*) AS cnt
                FROM    urls
                GROUP BY
                        site
                ) s
        ) sites
ORDER BY
        total, site

It can, though, be less efficient.
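The pre-8.4 query above effectively takes "the n-th URL of each site" for n = 0, 1, 2, ...; as a rough illustration, the same interleaving can be sketched in Python with `itertools.zip_longest` (site names and URLs are the question's example data):

```python
from itertools import zip_longest

per_site = {
    "stackoverflow.com": ["http://stackoverflow.com/questions/ask",
                          "http://stackoverflow.com/tags"],
    "use.perl.org": ["http://use.perl.org/"],
    "www.example.com": ["http://www.example.com/some/file",
                        "http://www.example.com/some/other/file"],
}

# zip_longest pairs up the n-th element of every site's list; None pads
# the shorter lists and is filtered out.
ordered = [u for group in zip_longest(*per_site.values())
           for u in group if u is not None]
```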

Quassnoi
The last query *really* needs to be checked for efficiency if your table is large.
Quassnoi
Thanks, it works perfectly! Looks like voodoo to me, but I'll figure it out someday.
hhaamu
I'm still using 8.3, but the table is only ~150 rows so far.
hhaamu
@hhaamu: note that the query time for the last query will grow exponentially. I'd test it on the maximum number of records you are planning to reach.
Quassnoi
@Quassnoi: Will do. Very much doubt there will be 1000 rows. And the query won't be run all that often (once a day or less).
hhaamu
+1  A: 

You are asking for round-robin, but I think a simple

SELECT site, url FROM urls ORDER BY RANDOM()

will do the trick. It should work even if URLs from the same site are clustered in the database.

wuub
A: 

If the URLs don't change very often, you can come up with a somewhat-complicated job that you could run periodically (nightly?) which would assign integers to each record based on the different sites present.

What you can do is write a routine that parses the domain out from a URL (you should be able to find a snippet that does this nearly anywhere).

Then, you create a temporary table that contains each unique domain, plus a number.

Then, for every record in your URLs table, you look up the domain in your temp table, assign that record the number stored there, and add a large number to that temp table's number.

Then for the rest of the day, sort by the number.


Here's an example with the five records you used in your question:

URLs:

http://www.example.com/some/file         NULL
http://www.example.com/some/other/file   NULL
http://stackoverflow.com/questions/ask   NULL
http://stackoverflow.com/tags            NULL
http://use.perl.org/                     NULL

Temp table:

example.com       1
stackoverflow.com 2
perl.org          3

Then for each URL, you look up the value in the temp table, and add 3 to it (because there are 3 distinct sites):

Iteration 1:

URLs:

http://www.example.com/some/file         1
http://www.example.com/some/other/file   NULL
http://stackoverflow.com/questions/ask   NULL
http://stackoverflow.com/tags            NULL
http://use.perl.org/                     NULL

Temp table:

example.com       4
stackoverflow.com 2
perl.org          3

Iteration 2:

URLs:

http://www.example.com/some/file         1
http://www.example.com/some/other/file   4
http://stackoverflow.com/questions/ask   NULL
http://stackoverflow.com/tags            NULL
http://use.perl.org/                     NULL

Temp table:

example.com       7
stackoverflow.com 2
perl.org          3

et cetera until you get to

http://www.example.com/some/file         1
http://www.example.com/some/other/file   4
http://stackoverflow.com/questions/ask   2
http://stackoverflow.com/tags            5
http://use.perl.org/                     3

For a lot of records it's going to be slow, and it will be awkward to maintain under frequent inserts and deletions, but the result is a flawless round-robin ordering.
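The counter scheme above can be sketched in plain Python (the function name and the use of `urlparse` to extract the site are illustrative choices, not part of the original answer):

```python
from urllib.parse import urlparse

def assign_round_robin(urls):
    """Assign each URL a sort key from a per-site counter that is
    bumped by the number of distinct sites, as described above."""
    sites = {urlparse(u).netloc for u in urls}
    step = len(sites)                        # 3 in the worked example
    # Seed each site with a distinct starting number 1..step.
    counter = {site: i + 1 for i, site in enumerate(sorted(sites))}
    order = {}
    for u in urls:
        site = urlparse(u).netloc
        order[u] = counter[site]
        counter[site] += step                # push the site's next URL back
    return sorted(urls, key=order.get)

urls = [
    "http://www.example.com/some/file",
    "http://www.example.com/some/other/file",
    "http://stackoverflow.com/questions/ask",
    "http://stackoverflow.com/tags",
    "http://use.perl.org/",
]
ordered = assign_round_robin(urls)
```

Sites end up evenly spaced in `ordered`, matching the final numbering in the worked example.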

Welbog
A: 

There is a much simpler and faster solution...

  • add a sort_order column of type TEXT
  • add an ON INSERT trigger which sets sort_order to md5( url )
  • index on sort_order
  • grab the rows in (sort_order, primary key) order

It's very fast (the scan is indexed), and rows will come out in a repeatable yet random-looking order.
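The effect of the md5 sort key can be sketched in Python: hashing each URL gives a repeatable pseudo-random key, so sorting by it scatters same-site URLs (the data is the question's example, not part of this answer):

```python
import hashlib

urls = [
    "http://www.example.com/some/file",
    "http://www.example.com/some/other/file",
    "http://stackoverflow.com/questions/ask",
    "http://stackoverflow.com/tags",
    "http://use.perl.org/",
]

# Precompute each URL's hex digest, then sort by it -- the same order
# the indexed sort_order column would give in the database.
keys = {u: hashlib.md5(u.encode()).hexdigest() for u in urls}
ordered = sorted(urls, key=keys.get)
```

Note this only scatters same-site URLs statistically; unlike the round-robin answers, it doesn't guarantee that no two adjacent URLs share a site.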

peufeu