ansaurus

Question

Answer 1

+1 A:

The performance tradeoff would basically be what you described:

If you modified the structure to store a rank, queries would be very, very simple and fast. However, this would require some overhead any time "points" changed, as you'd have to verify that the rank hasn't changed. If the ranking had changed, you'd have to do multiple updates.

This causes more work (with the potential for bugs) at every update/insert. The tradeoff is very fast reads. If you're typical usage is very few modifications compared to millions of reads, AND you found this query to be a bottleneck, it might be worth considering reworking this. However, I would avoid the added complexity and maintainability headaches unless you truly found this to be a problem, since the current solution requires less storage, and is very flexible.

Reed Copsey 2009-08-18 16:24:17

Answer 2

A:

Does not the 'where' portion of that query internally require reading the entire table? I understand about premature optimization. Academically, it seems that this wouldn't scale further than a few thousand rows.

Dustin Getz 2009-08-18 16:30:08

Answer 3

+1 A:

The link you reference is a MySQL question. If the original database had been Oracle the accepted answer would be to use an analytic function, which does scale, very nicely:

SQL> select id, name, points from users order by id
  2  /

        ID NAME           POINTS
---------- ---------- ----------
         1 john             4635
         3 tom              7364
         4 bob               234
         6 harry            9857
         8 algernon            1
         9 sebastian         234
        10 charles           888

7 rows selected.

SQL> select name, id, points, rank() over (order by points)
  2  from users
  3  /

NAME               ID     POINTS RANK()OVER(ORDERBYPOINTS)
---------- ---------- ---------- -------------------------
algernon            8          1                         1
bob                 4        234                         2
sebastian           9        234                         2
charles            10        888                         4
john                1       4635                         5
tom                 3       7364                         6
harry               6       9857                         7

7 rows selected.

SQL> select name, id, points, dense_rank() over (order by points desc)
  2  from users
  3  /

NAME               ID     POINTS DENSE_RANK()OVER(ORDERBYPOINTSDESC)
---------- ---------- ---------- -----------------------------------
harry               6       9857                                   1
tom                 3       7364                                   2
john                1       4635                                   3
charles            10        888                                   4
bob                 4        234                                   5
sebastian           9        234                                   5
algernon            8          1                                   6

7 rows selected.

SQL>

APC 2009-08-18 17:21:27

ansaurus

tags:

views:

answers:

db optimization: computing rank

related questions