tags:

views:

21

answers:

2

Hi,

I'm writing a query where I group a selection of rows to find the MIN value for one of the columns.

I'd also like to return the other column values associated with the MIN row returned.

e.g

ID QTY PRODUCT TYPE

12 Orange Fruit

2 4 Banana Fruit

3 3 Apple Fruit

If I GROUP this table by the column 'TYPE' and select the MIN qty, it won't return the corresponding product for the MIN row which in the case above is 'Apple'.

Adding an ORDER BY clause before grouping seems to solve the problem. However, before I go ahead and include this query in my application I'd just like to know whether this method will always return the correct value. Is this the correct approach? I've seen some examples where subqueries are used, however I have also read that this inefficient.

Thanks in advance.

+1  A: 

Adding an ORDER BY clause before grouping seems to solve the problem. However, before I go ahead and include this query in my application I'd just like to know whether this method will always return the correct value. Is this the correct approach? I've seen some examples where subqueries are used, however I have also read that this inefficient.

No, this is not the correct approach.

I believe you are talking about a query like this:

SELECT  product.*, MIN(qty)
FROM    product
GROUP BY
        type
ORDER BY
        qty

What you are doing here is using MySQL's extension that allows you to select unaggregated/ungrouped columns in a GROUP BY query.

This is mostly used in the queries containing both a JOIN and a GROUP BY on a PRIMARY KEY, like this:

SELECT  order.id, order.customer, SUM(price)
FROM    order
JOIN    orderline
ON      orderline.order_id = order.id
GROUP BY
        order.id

Here, order.customer is neither grouped nor aggregated, but since you are grouping on order.id, it is guaranteed to have the same value within each group.

In your case, all values of qty have different values within the group.

It is not guaranteed from which record within the group the engine will take the value.

You should do this:

SELECT  p.*
FROM    (
        SELECT  DISTINCT type
        FROM    product p
        ) pd
JOIN    p
ON      p.id = 
        (
        SELECT  pi.id
        FROM    product pi
        WHERE   pi.type = pd.type
        ORDER BY
                type, qty, id
        LIMIT 1
        )

If you create an index on product (type, qty, id), this query will work fast.

Quassnoi
Thanks! That was a very informative answer and the query works great.
poleposters
A: 

Hi,

It's difficult to follow you properly without an example of the query you try. From your comments I guess you query something like,

SELECT ID, COUNT(*) AS QTY, PRODUCT_TYPE 
    FROM PRODUCTS
    GROUP BY PRODUCT_TYPE
    ORDER BY COUNT(*) DESC;

My advice, you group by concept (in this case PRODUCT_TYPE) and you order by the times it appears count(*). The query above would do what you want. The sub-queries are mostly for sorting or dismissing rows that are not interested.

The MIN you look is not exactly a MIN, it is an occurrence and you want to see first the one who gives less occurrences (meaning appears less times, I guess).

Cheers,

jms