ansaurus

Question

sql problem,challenge

Answer 1

+3 A:

SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
  ON (t1.a=t2.a AND t1.b>t2.b)
WHERE t2.a IS NULL;

This works because there should be no matching row t2 with the same a and a lesser b.

update: This solution has the same issue with ties that other folks have identified. However, we can break ties:

SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
  ON (t1.a=t2.a AND (t1.b>t2.b OR t1.b=t2.b AND t1.id>t2.id))
WHERE t2.a IS NULL;

Assuming for instance that in the case of a tie, the row with the lower id should be the row we choose.

This doesn't do the trick:

select * from table group by a having min(b);

Because HAVING MIN(b) only tests that the least value in the group is not false (which in MySQL means not zero). The condition in a HAVING clause is for excluding groups from the result, not for choosing the row within the group to return.

Bill Karwin 2009-12-25 06:30:30

Happy holidays to you too, Bill. Alas, I can't vote more than once.

OMG Ponies 2009-12-25 08:49:47

Answer 2

A:

You're right. select min(b), a from table group by a. If you want the entire row, then you've use analytics function. That depends on database s/w.

Guru 2009-12-25 06:34:26

Answer 3

+1 A:

Use:

SELECT DISTINCT
       x.*
  FROM TABLE x
  JOIN (SELECT t.a,
               MIN(t.b) 'min_b'
          FROM TABLE T
      GROUP BY t.a) y ON y.a = x.a
                     AND y.min_b = x.b

OMG Ponies 2009-12-25 06:36:35

When I try this in MySQL (using the `create table` and `insert` statements from my answer to set up and adjusting table names to match), I get `ERROR 1054 (42S22): Unknown column 't.a' in 'on clause'`.

Asaph 2009-12-25 06:48:28

Probably should use `ON y.a=x.a AND y.min_b=x.b` in the join.

Bill Karwin 2009-12-25 06:55:46

Oops - too much eggnog, corrected.

OMG Ponies 2009-12-25 06:58:46

LOL! Happy Holidays, Ponies. :)

Bill Karwin 2009-12-25 08:28:03

Answer 4

+3 A:

In MySQL:

select t1.* from test as t1
inner join
(select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq
on subq.a=t1.a and subq.min_b=t1.b;

Here is the proof:

mysql> create table test (id int unsigned primary key auto_increment, a int unsigned not null, b int unsigned not null, c int unsigned not null) engine=innodb;
Query OK, 0 rows affected (0.55 sec)

mysql> insert into test (a,b,c) values (1,100,90), (1,300,50), (1,200,20), (2,200,30), (2,300,70), (2,50,100);
Query OK, 6 rows affected (0.39 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> select * from test;
+----+---+-----+-----+
| id | a | b   | c   |
+----+---+-----+-----+
|  1 | 1 | 100 |  90 |
|  2 | 1 | 300 |  50 |
|  3 | 1 | 200 |  20 |
|  4 | 2 | 200 |  30 |
|  5 | 2 | 300 |  70 |
|  6 | 2 |  50 | 100 |
+----+---+-----+-----+
6 rows in set (0.00 sec)

mysql> select t1.* from test as t1 inner join (select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq on subq.a=t1.a and subq.min_b=t1.b;
+----+---+-----+-----+
| id | a | b   | c   |
+----+---+-----+-----+
|  1 | 1 | 100 |  90 |
|  6 | 2 |  50 | 100 |
+----+---+-----+-----+
2 rows in set (0.00 sec)

Asaph 2009-12-25 06:41:05

Try this again after doing a "insert into test values (7,2,100,95)" and it doesn't work.

Chris Williams 2009-12-25 06:48:04

This only works if the MIN(b)'s for each Group a, *never* show up as non-MIN()s in other group-a's. For instance for Table{a,b}(1,200),(2,100),(2,200); it would return all three rows, even though there are only two groups..

RBarryYoung 2009-12-25 06:50:43

Why are people upvoting this? It's clearly one of the few incorrect answers here.

RBarryYoung 2009-12-25 07:00:12

@RBarryYoung: Duh. Good point. It must be getting late here. I changed my answer. It no longer suffers from that flaw. Kindly confirm and remove your downvote if you agree. Thank you.

Asaph 2009-12-25 07:01:04

The data *already* has ties in the b column, they just don't happem to be on the MIN()s. I cannot see how it is at all valid to just *assume* this, with no justification.

RBarryYoung 2009-12-25 07:02:02

Sorry, I was still commenting and we crossed. I clearly overdid it... :-(

RBarryYoung 2009-12-25 07:02:53

@RBarryYoung: There are no ties within an `a` group, mins or otherwise.

Asaph 2009-12-25 07:04:52

It's *across* a-groups that causes the problem. 200 and 300 both occur twice, just not as mins in any group in this dataset.

RBarryYoung 2009-12-25 09:17:34

RBarryYoung: There is no problem across groups anymore. I updated my answer. Please confirm and kindly remove your downvote if you agree.

Asaph 2009-12-25 09:42:44

Answer 5

A:

It depends on the implementation, but this is usually faster than the self-join method:

SELECT id, a, b, c
FROM
    (
        SELECT id, a, b, c
        , ROW_NUMBER() OVER(PARTITION BY a ORDER BY b ASC) AS [b IN a]
    ) As SubqueryA
WHERE [b IN a] = 1

Of course it does require that you SQL implementation be fairly up-to-date with the standard.

RBarryYoung 2009-12-25 06:44:00

Interesting - that would work on Oracle 9i+ and SQL Server 2005+, but not MySQL.

OMG Ponies 2009-12-25 06:47:21

Does the latest version of MySQL not have the ROW_NUMBER() function yet? That's a real shame, its hugely useful (and has been in the standard for a while now).

RBarryYoung 2009-12-25 06:53:55

Last I checked, MySQL has no ranking functionality at all :/

OMG Ponies 2009-12-25 06:59:33

Bummer. It's my favorite new feature of SQL Server 2005.

RBarryYoung 2009-12-25 07:04:35

Right, analytic functions are an outstanding feature request in MySQL: http://bugs.mysql.com/bug.php?id=35893

Bill Karwin 2009-12-25 08:30:23

Someone only noticed in 2008?! The `WITH` feature request dates to 2006.

OMG Ponies 2009-12-25 08:50:45

ansaurus

tags:

views:

answers:

sql problem,challenge

related questions