tags:

views:

105

answers:

6

Given a command in SQL;

SELECT ...
    FROM ...
    GROUP BY ...

Can I group by something that isn't in the SELECT line?

+3  A: 

Yes of course e.g.

select 
   count(*)
from  
   some_table_with_updated_column
group by 
   trunc(updated, 'MM.YYYY') 
Aleksi
+2  A: 

Yes, you can. Example:

select count(1) 
  from sales
 group by salesman_id

What you can't do, of course, if having something on your select clause (other than aggregate functions) that are not part of the group by clause.

Pablo Santa Cruz
Sadly enough MySQL *does* allow you to put everything in the select clause, even when it's not in the group by. This leads to broken queries actually getting executed. This way the problem is often hidden and becomes very hard to find.
Joachim Sauer
wow... didn't know that. sad to hear. thanks for the comment.
Pablo Santa Cruz
A: 

I don't know about other DBMS' but DB2/z, for one, does this just fine. It's not required to have the column in the select portion but, of course, it does have to extract the data from the table in order to aggregate so you're probably not saving any time by leaving it off. You should only select the columns that you need, aggregation of the data is a separate task from that.

I'm pretty certain the SQL standard allows this (although that's only based on the knowledge that the mainframe DB2 product follows it pretty closely).

paxdiablo
+3  A: 

Yes you can do it, but if you do that you won't be able to tell which result is for which group.

As a result, you almost always want to return the columns you've grouped by in the select clause. But you don't have to.

Mark Byers
+2  A: 

Yes.

This is often used in the superaggregate queries like this:

SELECT  AVG(cnt)
FROM    (
        SELECT  COUNT(*) AS cnt
        FROM    sales
        GROUP BY
                product
        HAVING  COUNT(*) > 10
        ) q

, which aggregate the aggregates.

Quassnoi
A: 

Hmm, I think the question should have been in the other way round like,

Can I SELECT something that is not there in the GROUP BY?

It's alright to write a code like:

SELECT customerId, count(orderId) FROM orders
 GROUP BY customerId, orderedOn

If you want to find out the number of orders done by a customer datewise. But you cannot do it the other way round:

SELECT customerId, orderedOn count(orderId) FROM orders
 GROUP BY customerId

You can issue an aggregate function on the column that is not there in the group by. But you cannot give it in the select line without the aggregate function. As it will not make much sense. Like for the above query. You group by just customerId for order counts and you want the date also to be printed in the output??!! You don't involve the date factor in the group for counting then will it mean something to have a date in it?

zapping