GROUP BY
maps groups of rows to one row, per distinct value in specific columns, which don't even necessarily have to be in the select-list.
SELECT b, c, d FROM table1 GROUP BY a;
This query is legal SQL (correction: only in MySQL; actually it's not standard SQL and not supported by other brands). MySQL accepts it, and it trusts that you know what you're doing, selecting b
, c
, and d
in an unambiguous way because they're functional dependencies of a
.
However, Microsoft SQL Server and other brands don't allow this query, because it can't determine the functional dependencies easily. edit: Instead, standard SQL requires you to follow the Single-Value Rule, i.e. every column in the select-list must either be named in the GROUP BY
clause or else be an argument to a set function.
Whereas DISTINCT
always looks at all columns in the select-list, and only those columns. It's a common misconception that DISTINCT
allows you to specify the columns:
SELECT DISTINCT(a), b, c FROM table1;
Despite the parentheses making DISTINCT
look like function call, it is not. It's a query option and a distinct value in any of the three fields of the select-list will lead to a distinct row in the query result. One of the expressions in this select-list has parentheses around it, but this won't affect the result.