tags:

views:

580

answers:

2

I need to take the first N rows for each group, ordered by custom column.

Given the following table:

db=# SELECT * FROM xxx;
 id | section_id | name
----+------------+------
  1 |          1 | A
  2 |          1 | B
  3 |          1 | C
  4 |          1 | D
  5 |          2 | E
  6 |          2 | F
  7 |          3 | G
  8 |          2 | H
(8 rows)

I need the first 2 rows (ordered by name) for each section_id, i.e. a result similar to:

 id | section_id | name
----+------------+------
  1 |          1 | A
  2 |          1 | B
  5 |          2 | E
  6 |          2 | F
  7 |          3 | G
(5 rows)

I am using PostgreSQL 8.3.5.

+2  A: 
SELECT  x.*
FROM    (
        SELECT  section_id,
                COALESCE
                (
                (
                SELECT  xi
                FROM    xxx xi
                WHERE   xi.section_id = xo.section_id
                ORDER BY
                        name, id
                OFFSET 1 LIMIT 1
                ),
                (
                SELECT  xi
                FROM    xxx xi
                WHERE   xi.section_id = xo.section_id
                ORDER BY 
                        name DESC, id DESC
                LIMIT 1
                )
                ) AS mlast
        FROM    (
                SELECT  DISTINCT section_id
                FROM    xxx
                ) xo
        ) xoo
JOIN    xxx x
ON      x.section_id = xoo.section_id
        AND (x.name, x.id) <= ((mlast).name, (mlast).id)
Quassnoi
I am getting: ERROR: syntax error at or near "JOIN"
Kouber Saparev
@Kouber: see post update
Quassnoi
The query is very close to the one I need, except that it is not showing sections with less than 2 rows, i.e. the row with ID=7 isn't returned. Otherwise I like your approach.
Kouber Saparev
@Kouber: see post update
Quassnoi
Thank you, I just came to the same solution with COALESCE, but you were faster. :-)
Kouber Saparev
Actually the last JOIN sub-clause could be simplified to: ... AND x.id <= (mlast).idas the ID have already been chosen according to the name field, no?
Kouber Saparev
@Kouber: in this case you'll get the first `2` `id`s, not first `2` names.
Quassnoi
@Kouber: in your example the `name`'s and `id`'s are sorted in same order, so you won't see it. Make the names in reverse order and you will see that these queries yield different results.
Quassnoi
You're right, it cannot be simplified that way.
Kouber Saparev
+1  A: 

Here's another solution.

SELECT
  *
FROM
  xxx a
WHERE (
  SELECT
    COUNT(*)
  FROM
    xxx
  WHERE
    section_id = a.section_id
  AND
    name <= a.name
) <= 2
Kouber Saparev