tags:

views:

87

answers:

4
+2  Q: 

Grouping problem

+3  A: 

You are using non-standard MySQL extension to GROUP BY.

This query in fact reads as "for each distinct value of col3, select the minimal value of col4 along with the values of col1 and col2 from a single row of table having this value of col3 in no particular order"

Like, if we have the following data:

col1 col2 col3 col4
---- ---  ---  ----

A    A    1    1
B    B    1    2
C    C    2    3
D    D    2    4

, this query:

SELECT  col1, col2, col3, MIN(col4)
FROM    mytable
GROUP BY
        col3

will return either of the following:

col1 col2 col3 col4
---- ---  ---  ----
A    A    1    1
C    C    2    3

col1 col2 col3 col4
---- ---  ---  ----
B    B    1    1
C    C    2    3

col1 col2 col3 col4
---- ---  ---  ----
A    A    1    1
D    D    2    3

col1 col2 col3 col4
---- ---  ---  ----
B    B    1    1
D    D    2    3

i. e. it can return any value of col1 and col2 found in the rows that contribute to the corresponding group.

This is equivalent of FIRST_VALUE analytic function, but in no particular order.

Update:

To select values of col1 and col2 corresponding to the minimal value of col4 within each group, use this:

SELECT  col1, co2, col3, col4
FROM    (
        SELECT  col1, col2, col3, col4,
                COALESCE(@col3 = col3, FALSE) AS grp,
                @col3 := col3 AS nv
        FROM    (
                SELECT  @col3 := NULL
                ) vars, table
        WHERE   col2 = 'xxx'
        ORDER BY
                col3, col4
        ) q
WHERE   NOT grp
Quassnoi
Sweet mother of all that is holy that little extension must be rife with the corpses of hopes and dreams. What a dangerous implementation, especially considering the tables aren't stored in any particular order. The only thing that would be reliable on would be if your column had one and only one potential value for all cases. Even then, I'm not sure how much perf gain you'd get.
Eric
@Eric: when I *know* that all values I have will be same for any group (like when I group on a `PRIMARY KEY` of a joined table and want to select non-key values), the need for `GROUP BY` sometimes annoys me. It's a very nice feature when used properly.
Quassnoi
@Eric: Yeah, when you're grouping by the id and you have lots of columns to select, listing them all in the group by is redundant. Now if it would restrict it to cases where you're grouping by a unique/pk that would be nice.
Draemon
Looking at the OP again, I think he wants col1-3 corresponding to min(col4)
Draemon
A: 
 select a.col3, a.col2, a.col1, a.col4
 from table as a natural join 
       (select col3, min(col4) as col4 from table
        where col2='xxx'
        group by col3 ) as b
 where a.col2 = 'xxx' -- sometimes this helps the optimizer even though it's redundant

you can get into a bit of trouble here when there may be multiple rows with the same col3, col4, and col2, but different col1s -- pretty straightforward to fix with rownums and such, but that gets db-specific.

SquareCog
A: 

I guess you want the col1-3 corresponding to the min(col4) for each col3?

Something like:

select X.col1, X.col2, X.col3, X.col4 from table X 
join (select col3, min(col4) as mcol4 from table where col2='xxx' group by col3) as Y
on X.col3=Y.col3 and X.col4=Y.mcol4
where X.col2='xxx';
Draemon
@Draemon: `col2 = 'xxx'` must be in the inner subquery too. Also note that this solution is not duplicate friendly, and we cannot assume a primary key here.
Quassnoi
Yes, you're right (edited). The times I've done this duplicates haven't mattered - I've just been picking the first row. But you're quite right.
Draemon
A: 

Thanks Guys really appreciate your efforts here and all is working well could not have done it without your help.

Thanks Ben

Ben