ansaurus

Question

How to select distinct rows for a column

Answer 1

A:

Just use GROUP BY:

select id, name, min(observed_value) as minimum_val from x group by name;

Salil 2010-05-31 06:02:57

This query will result in an error: column id is not part of the group by clause and there is no aggregate function specified for it.

Tommi 2010-05-31 06:09:47

@Tommi: unless the OP is using MySQL.

Martin Smith 2010-05-31 08:50:07

@Martin: in `MySQL`, this query will not fail; however, `MIN(observed_value)` and `id` are not guaranteed to come from the same record (and most probably won't).

Quassnoi 2010-05-31 09:28:43

Answer 2

A:

Your query looks too complex for your purpose. Just group by the `name` column and use an aggregate function for the rest of the columns. For example:

SELECT name, min(id), min(observed_value) FROM x GROUP BY name

(Obviously, choose an aggregate function other than `min` if you want other values for each name.)
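
One thing to watch with this approach: each aggregate is computed on its own over the group, so the `id` and the `observed_value` in one result row need not belong to the same source record. Here is a minimal sketch using Python's built-in `sqlite3` module. The sample rows are hypothetical (the original data is not shown in this thread), and SQLite merely stands in for whatever DBMS the OP is using.

```python
import sqlite3

# Hypothetical sample data, made up for illustration only.
rows = [
    (1, "a", 300),
    (2, "a", 100),
    (3, "b", 50),
    (4, "b", 75),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT, observed_value INTEGER)"
)
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", rows)

# MIN(id) and MIN(observed_value) are evaluated independently per group.
per_column_min = conn.execute(
    "SELECT name, MIN(id), MIN(observed_value) FROM x GROUP BY name ORDER BY name"
).fetchall()

for name, min_id, min_value in per_column_min:
    # Check whether the (id, name, value) combination exists as a real row.
    is_real_row = (min_id, name, min_value) in rows
    print(name, min_id, min_value, "real row" if is_real_row else "mixed from two rows")
```

For name `a`, the result pairs id `1` with value `100`, although record `1` actually holds `300`. If you need the whole record that carries the minimum, a per-group lookup is the safer route.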

Tommi 2010-05-31 06:08:49

Answer 3

+1 A:

Since you didn't specify which DBMS you are using, I'll provide a couple of solutions:

If you are using a DBMS that has the FIRST() aggregate function, you could use:

SELECT 
  FIRST(id) as id, 
  name, 
  FIRST(observed_value) as observed_value 
FROM x
GROUP BY name;

If you are using MySQL, you could use ORDER BY in conjunction with LIMIT to get something similar to a FIRST() aggregate function.

SELECT
  ( SELECT x2.id 
    FROM x as x2 
    WHERE x2.name = x.name 
    ORDER BY observed_value ASC 
    LIMIT 1
  ) AS id,
  name,
  MIN(observed_value) as observed_value
FROM x
GROUP BY name
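
To try the ORDER BY / LIMIT variant without a MySQL server, here is a rough sketch that runs the same query shape against an in-memory SQLite database through Python's `sqlite3` module. The sample rows are hypothetical, and the extra `x2.id` tie-breaker in the ORDER BY is my addition so the result is deterministic when two rows share the same `observed_value`.

```python
import sqlite3

# Hypothetical sample data, made up for illustration only.
rows = [
    (1, "a", 300),
    (2, "a", 100),
    (3, "b", 50),
    (4, "b", 75),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT, observed_value INTEGER)"
)
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", rows)

# The correlated subquery picks the id of the row holding the smallest
# observed_value for each name; x2.id is an added tie-breaker (assumption).
first_per_name = conn.execute(
    """
    SELECT
      ( SELECT x2.id
        FROM x AS x2
        WHERE x2.name = x.name
        ORDER BY x2.observed_value ASC, x2.id ASC
        LIMIT 1
      ) AS id,
      name,
      MIN(observed_value) AS observed_value
    FROM x
    GROUP BY name
    ORDER BY name
    """
).fetchall()

print(first_per_name)
```

With this data, every returned `id` belongs to the record that actually holds the minimum value for its name.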

Senseful 2010-05-31 06:48:20

Answer 4

+1 A:

SELECT  t.*
FROM    (
        SELECT  DISTINCT name
        FROM    mytable
        ) q
JOIN    mytable t
ON      t.id =
        (
        SELECT  id
        FROM    mytable ti
        WHERE   ti.name = q.name
        ORDER BY
                ti.name, ti.observed_value, ti.id
        LIMIT 1
        )

Create an index on (name, observed_value, id) for this query to be efficient.
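
A minimal sketch of that index, tried out with Python's `sqlite3` module against an in-memory database. The index name `ix_mytable_name_value_id` and the sample rows are hypothetical, and SQLite is only a stand-in here; index syntax and planner behavior on your DBMS may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, observed_value INTEGER)"
)
# Hypothetical sample data, made up for illustration only.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "a", 300), (2, "a", 100), (3, "b", 50), (4, "b", 75)],
)

# Composite index in the same column order as the subquery's ORDER BY.
conn.execute(
    "CREATE INDEX ix_mytable_name_value_id ON mytable (name, observed_value, id)"
)

# List the indexes SQLite has recorded for the table.
index_names = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'mytable'"
    )
]

# Same query as above, with an outer ORDER BY added for stable output.
result = conn.execute(
    """
    SELECT t.*
    FROM (SELECT DISTINCT name FROM mytable) q
    JOIN mytable t
    ON t.id = (
        SELECT id
        FROM mytable ti
        WHERE ti.name = q.name
        ORDER BY ti.name, ti.observed_value, ti.id
        LIMIT 1
    )
    ORDER BY t.id
    """
).fetchall()

print(index_names)
print(result)
```

The column order matters: with `name` first, then `observed_value`, then `id`, the first index entry for a given name is exactly the row the subquery is looking for.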

Quassnoi 2010-05-31 09:23:54

Thanks for your answer. But when I run this, it is slower than my current query, which is in my question description.

Satoru.Logic 2010-05-31 09:35:09

@Satoru: did you create the index I suggested?

Quassnoi 2010-05-31 09:45:36

@Quassnoi: Not yet. But I think my solution will also speed up if I create those indexes, or am I missing something?

Satoru.Logic 2010-05-31 10:00:22

@Satoru: yes, it will. Your solution, BTW, is fine, unless it's possible for your model to have duplicates of `observed_value`. For instance, if record `4` had an observed value of `100` (instead of `150`), your solution would return two records for `a`, while mine is guaranteed to return only one (in case of a tie, the record with the least id is returned). You may want to read this article on my blog: http://explainextended.com/2009/11/25/mysql-selecting-records-holding-group-wise-maximum-resolving-ties/
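
The tie behavior can be sketched with Python's `sqlite3` module. Two loud assumptions: the question's query is not reproduced in this thread, so the "join against the group-wise minimum" query below is only a guess at its general shape, and the rows are hypothetical, arranged so that records `2` and `4` tie at `100` for `a`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, observed_value INTEGER)"
)
# Hypothetical data: records 2 and 4 share the minimum value for name 'a'.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "a", 300), (2, "a", 100), (3, "b", 50), (4, "a", 100)],
)

# Assumed shape of a join-on-minimum query (not taken from the question).
join_on_min = conn.execute(
    """
    SELECT t.id, t.name, t.observed_value
    FROM mytable t
    JOIN (
        SELECT name, MIN(observed_value) AS min_value
        FROM mytable
        GROUP BY name
    ) m
    ON m.name = t.name AND m.min_value = t.observed_value
    ORDER BY t.id
    """
).fetchall()

# Correlated subquery with LIMIT 1; ti.id in the ORDER BY breaks ties.
limit_one = conn.execute(
    """
    SELECT t.*
    FROM (SELECT DISTINCT name FROM mytable) q
    JOIN mytable t
    ON t.id = (
        SELECT id
        FROM mytable ti
        WHERE ti.name = q.name
        ORDER BY ti.name, ti.observed_value, ti.id
        LIMIT 1
    )
    ORDER BY t.id
    """
).fetchall()

print("join on minimum:", join_on_min)
print("LIMIT 1 lookup: ", limit_one)
```

With the tie in place, the join-on-minimum query returns both records `2` and `4` for `a`, while the LIMIT 1 lookup keeps only record `2`, the one with the smaller id.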

Quassnoi 2010-05-31 13:00:53

@Quassnoi: Thanks, your explanation is insightful to me :)

Satoru.Logic 2010-05-31 13:38:33
