ansaurus

Question

How to think of aggregate functions in terms of sets

Answer 1

+1 A:

For any of your queries that perform aggregation to work, you need to group by the correct fields.

The first query should fail because the c.id, c.user_id, c.name, and c.created_at fields are not grouped using GROUP BY.

Similarly, the second query will fail as well because only the first field is grouped.

To get the last query to work, you might need to include the id in GROUP BY as well.

Aggregate functions only work when all of the non-aggregate elements of your SELECT clause (e.g., c.id, c.user_id, etc) represent the group being aggregated (i.e., are included in the GROUP BY clause).
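As a rough illustration of that rule, here is a minimal sketch using Python's built-in sqlite3 module. The collections table, its columns, and the sample rows are assumptions made up for this example, since the original queries are not shown here. Every non-aggregate column in the SELECT list also appears in GROUP BY.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO collections (user_id, name, created_at) VALUES (?, ?, ?)",
    [
        (1, "books", "2009-09-01"),
        (1, "books", "2009-09-02"),
        (1, "films", "2009-09-03"),
        (2, "books", "2009-09-04"),
    ],
)

# c.user_id and c.name are the only non-aggregate columns selected,
# and both are listed in GROUP BY.
rows = conn.execute(
    "SELECT c.user_id, c.name, COUNT(*) AS n "
    "FROM collections c "
    "GROUP BY c.user_id, c.name "
    "ORDER BY c.user_id, c.name"
).fetchall()

print(rows)  # [(1, 'books', 2), (1, 'films', 1), (2, 'books', 1)]
```

Selecting c.id or c.created_at here without aggregating them would break the rule, because those values differ between rows of the same group.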

David Andres 2009-09-08 01:58:13

I'm pretty sure MySQL is lax about enforcing that requirement, and will actually execute queries 2 and 3.

derobert 2009-09-08 01:59:24

Good to know, but I thought it stayed close enough to the SQL standard on this one.

David Andres 2009-09-08 02:01:00

Thanks. Queries 1, 2, and 3 all execute in MySQL; I tried them before posting. MySQL does not enforce any of these requirements, though the results only make sense when the columns are being aggregated.

2009-09-08 03:04:37

Answer 2

+1 A:

GROUP BY doesn't make multiple sets. It makes one grouping; in your case, it's grouping by the pair (c.user_id, c.name). Any rows with the same (c.user_id, c.name) are put together in a group, and those are the groups COUNT(*) will be working on.
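One way to picture this without a database is to build the groups by hand. The sketch below is plain Python with made-up rows (the tuple layout and values are assumptions, not data from the question): rows sharing the same (user_id, name) pair land in the same bucket, and the count is just the size of each bucket.

```python
from collections import defaultdict

# Hypothetical rows as (id, user_id, name); invented for illustration.
rows = [
    (1, 1, "books"),
    (2, 1, "books"),
    (3, 1, "films"),
    (4, 2, "books"),
]

# The grouping key is the pair (user_id, name), as in GROUP BY user_id, name.
groups = defaultdict(list)
for row in rows:
    _id, user_id, name = row
    groups[(user_id, name)].append(row)

# COUNT(*) per group is the number of rows that fell into each bucket.
counts = {key: len(members) for key, members in groups.items()}

print(counts)  # {(1, 'books'): 2, (1, 'films'): 1, (2, 'books'): 1}
```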

derobert 2009-09-08 01:58:15

Answer 3

A:

Aggregate functions are computed after (a) the joins and (b) the filtering of rows by the WHERE clause.

Picture your data set once the join is completed and the rows have been filtered by the WHERE clause in your query. The GROUP BY clause will now subdivide that data set into distinct groups based on the columns specified in your GROUP BY clause. All the rows in a given group will have the same value for every column specified in your GROUP BY clause.

Once the rows in the original data set have been classified into groups, you can only query for (i.e., select) things that are common to a whole group. In your second example, where you have grouped by c.user_id, there will be one group for each distinct user_id in your collections table. If you do not have a HAVING clause in your query, your query will return one row per group. You can think (only think) of each group as a row containing columns. These columns represent things that are common to the entire group, such as COUNT(*), SUM, MAX, and MIN. The value of the column on which the groups are formed is also the same for every row in a group!
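To make "one row per group" concrete, here is a small sketch with Python's sqlite3 module. The table layout and rows are assumptions invented for this example. Grouping by user_id alone gives one result row per distinct user_id, and each row carries only the grouping column plus aggregates computed over the group.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO collections (user_id, name, created_at) VALUES (?, ?, ?)",
    [
        (1, "books", "2009-09-01"),
        (1, "books", "2009-09-02"),
        (1, "films", "2009-09-03"),
        (2, "books", "2009-09-04"),
    ],
)

# One output row per distinct user_id; the other columns are aggregates.
per_user = conn.execute(
    "SELECT c.user_id, COUNT(*), MIN(c.created_at), MAX(c.created_at) "
    "FROM collections c "
    "GROUP BY c.user_id "
    "ORDER BY c.user_id"
).fetchall()

print(per_user)
# [(1, 3, '2009-09-01', '2009-09-03'), (2, 1, '2009-09-04', '2009-09-04')]
```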

The HAVING clause is like a WHERE clause for groups. It filters out entire groups that do not satisfy the predicate in the HAVING clause.
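The difference between the two filters can be sketched as follows, again with Python's sqlite3 module and an assumed collections table whose rows are made up for this example. WHERE removes individual rows before the groups are formed; HAVING removes whole groups afterwards.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO collections (user_id, name, created_at) VALUES (?, ?, ?)",
    [
        (1, "books", "2009-09-01"),
        (1, "books", "2009-09-02"),
        (1, "films", "2009-09-03"),
        (2, "books", "2009-09-04"),
    ],
)

# HAVING only: groups with fewer than 2 rows are dropped (user 2 has 1 row).
having_only = conn.execute(
    "SELECT c.user_id, COUNT(*) FROM collections c "
    "GROUP BY c.user_id HAVING COUNT(*) >= 2 "
    "ORDER BY c.user_id"
).fetchall()

# WHERE + HAVING: the row dated 2009-09-01 is removed before grouping,
# so user 1's group is counted over 2 rows instead of 3.
where_and_having = conn.execute(
    "SELECT c.user_id, COUNT(*) FROM collections c "
    "WHERE c.created_at >= '2009-09-02' "
    "GROUP BY c.user_id HAVING COUNT(*) >= 2 "
    "ORDER BY c.user_id"
).fetchall()

print(having_only)       # [(1, 3)]
print(where_and_having)  # [(1, 2)]
```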

bkm 2009-09-08 04:50:56
