views:

106

answers:

5

I want to get

id   a    b     c
--------------------
1    1    100   90
6    2    50    100

...from:

id   a    b     c
--------------------
1    1    100   90
2    1    300   50
3    1    200   20
4    2    200   30
5    2    300   70
6    2    50    100

It's the row with the smallest b group by a.

How to do it with sql?

EDIT

I thought it can be achieved by

select * from table group by a having min(b);

which I found later it's wrong.

But is it possible to do it with having statement?

I'm using MySQL

+3  A: 
SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
  ON (t1.a=t2.a AND t1.b>t2.b)
WHERE t2.a IS NULL;

This works because there should be no matching row t2 with the same a and a lesser b.


update: This solution has the same issue with ties that other folks have identified. However, we can break ties:

SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
  ON (t1.a=t2.a AND (t1.b>t2.b OR t1.b=t2.b AND t1.id>t2.id))
WHERE t2.a IS NULL;

Assuming for instance that in the case of a tie, the row with the lower id should be the row we choose.


This doesn't do the trick:

select * from table group by a having min(b);

Because HAVING MIN(b) only tests that the least value in the group is not false (which in MySQL means not zero). The condition in a HAVING clause is for excluding groups from the result, not for choosing the row within the group to return.

Bill Karwin
Happy holidays to you too, Bill. Alas, I can't vote more than once.
OMG Ponies
A: 

You're right. select min(b), a from table group by a. If you want the entire row, then you've use analytics function. That depends on database s/w.

Guru
+1  A: 

Use:

SELECT DISTINCT
       x.*
  FROM TABLE x
  JOIN (SELECT t.a,
               MIN(t.b) 'min_b'
          FROM TABLE T
      GROUP BY t.a) y ON y.a = x.a
                     AND y.min_b = x.b
OMG Ponies
When I try this in MySQL (using the `create table` and `insert` statements from my answer to set up and adjusting table names to match), I get `ERROR 1054 (42S22): Unknown column 't.a' in 'on clause'`.
Asaph
Probably should use `ON y.a=x.a AND y.min_b=x.b` in the join.
Bill Karwin
Oops - too much eggnog, corrected.
OMG Ponies
LOL! Happy Holidays, Ponies. :)
Bill Karwin
+3  A: 

In MySQL:

select t1.* from test as t1
inner join
(select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq
on subq.a=t1.a and subq.min_b=t1.b;

Here is the proof:

mysql> create table test (id int unsigned primary key auto_increment, a int unsigned not null, b int unsigned not null, c int unsigned not null) engine=innodb;
Query OK, 0 rows affected (0.55 sec)

mysql> insert into test (a,b,c) values (1,100,90), (1,300,50), (1,200,20), (2,200,30), (2,300,70), (2,50,100);
Query OK, 6 rows affected (0.39 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> select * from test;
+----+---+-----+-----+
| id | a | b   | c   |
+----+---+-----+-----+
|  1 | 1 | 100 |  90 |
|  2 | 1 | 300 |  50 |
|  3 | 1 | 200 |  20 |
|  4 | 2 | 200 |  30 |
|  5 | 2 | 300 |  70 |
|  6 | 2 |  50 | 100 |
+----+---+-----+-----+
6 rows in set (0.00 sec)

mysql> select t1.* from test as t1 inner join (select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq on subq.a=t1.a and subq.min_b=t1.b;
+----+---+-----+-----+
| id | a | b   | c   |
+----+---+-----+-----+
|  1 | 1 | 100 |  90 |
|  6 | 2 |  50 | 100 |
+----+---+-----+-----+
2 rows in set (0.00 sec)
Asaph
Try this again after doing a "insert into test values (7,2,100,95)" and it doesn't work.
Chris Williams
This only works if the MIN(b)'s for each Group a, *never* show up as non-MIN()s in other group-a's. For instance for Table{a,b}(1,200),(2,100),(2,200); it would return all three rows, even though there are only two groups..
RBarryYoung
Why are people upvoting this? It's clearly one of the few incorrect answers here.
RBarryYoung
@RBarryYoung: Duh. Good point. It must be getting late here. I changed my answer. It no longer suffers from that flaw. Kindly confirm and remove your downvote if you agree. Thank you.
Asaph
The data *already* has ties in the b column, they just don't happem to be on the MIN()s. I cannot see how it is at all valid to just *assume* this, with no justification.
RBarryYoung
Sorry, I was still commenting and we crossed. I clearly overdid it... :-(
RBarryYoung
@RBarryYoung: There are no ties within an `a` group, mins or otherwise.
Asaph
It's *across* a-groups that causes the problem. 200 and 300 both occur twice, just not as mins in any group in this dataset.
RBarryYoung
RBarryYoung: There is no problem across groups anymore. I updated my answer. Please confirm and kindly remove your downvote if you agree.
Asaph
A: 

It depends on the implementation, but this is usually faster than the self-join method:

SELECT id, a, b, c
FROM
    (
        SELECT id, a, b, c
        , ROW_NUMBER() OVER(PARTITION BY a ORDER BY b ASC) AS [b IN a]
    ) As SubqueryA
WHERE [b IN a] = 1

Of course it does require that you SQL implementation be fairly up-to-date with the standard.

RBarryYoung
Interesting - that would work on Oracle 9i+ and SQL Server 2005+, but not MySQL.
OMG Ponies
Does the latest version of MySQL not have the ROW_NUMBER() function yet? That's a real shame, its hugely useful (and has been in the standard for a while now).
RBarryYoung
Last I checked, MySQL has no ranking functionality at all :/
OMG Ponies
Bummer. It's my favorite new feature of SQL Server 2005.
RBarryYoung
Right, analytic functions are an outstanding feature request in MySQL: http://bugs.mysql.com/bug.php?id=35893
Bill Karwin
Someone only noticed in 2008?! The `WITH` feature request dates to 2006.
OMG Ponies