views:

271

answers:

7

Given the following table 'foo'

ID | First Name | Last Name
----------------------------
67   John        Smith
----------------------------
67   Bill        Jacobs

What first_name and last_name will the following query return and why?

SELECT * FROM foo WHERE ID = 67 GROUP BY ID
A: 

It's very likely that the second (last) row's first name and last name will be picked.

You can add an ORDER BY clause to give hints on how you'd like the grouped rows to be sorted.

Charles
No, MySQL returns values from the first row in the group. ORDER BY is applied to rows after groups have been formed--it doesn't affect the order of rows within the group at all.
Bill Karwin
I stand corrected :)
Charles
+1  A: 

MySQLs group by is not consistent with the standard SQL behaviour , MySQL makes it easy to get other columns BUT at the same time u can never be sure which one u will get.

Update: refer to this page: http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html

When using this feature, all rows in each group should have the same values for the columns that are ommitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.

Sabeen Malik
A: 

In standard SQL, This SQL should fail, with a server processor error something like

"firstname, and lastname cannot be included in the select clause unless they are also in the group By, or are part of an aggregate function."

Does MySql actually return data for this ?

Charles Bretana
Yes, MySQL allows you to do this
Greg
+1  A: 

It is undefined, which result you are going to get.

I was always wondering why this behaviour was even allowed. Really, I wish such code would just generate an error (preferrably, a decipherable one, none of that usual MySQL's "your statement has a problem, but I don't know where" stuff).

shylent
+7  A: 

MySQL chooses a row arbitrarily. In practice, commonly used MySQL storage engines return the values from the first row in the group, with respect to the physical storage.

create table foo (id serial primary key, category varchar(10));

insert into foo (category) values 
  ('foo'), ('foo'), ('foo'), ('bar'), ('bar'), ('bar');

select * from foo group by category;

+----+----------+
| id | category |
+----+----------+
|  4 | bar      |
|  1 | foo      |
+----+----------+

Other folks are correct that MySQL allows you to run this query even though it has arbitrary and potentially misleading results. The SQL standard, and most other RDBMS vendors, disallow this kind of ambiguous GROUP BY query. This is called the Single-Value Rule: all columns in the select-list must be explicitly part of the GROUP BY criteria, or else inside an aggregate function, e.g. COUNT(), MAX(), etc.

MySQL supports a SQL mode ONLY_FULL_GROUP_BY that makes MySQL return an error if you try to run a query that violates SQL standard semantics.

AFAIK, SQLite is the only other RDBMS that allows ambiguous columns in a grouped query. SQLite returns values from the last row in the group:

select * from foo group by category;

6|bar
3|foo


We can imagine queries that would not be ambiguous, yet still violate the SQL standard semantics.

SELECT foo.*, parent_of_foo.* 
FROM foo JOIN parent_of_foo 
  ON (foo.parent_id = parent_of_foo.parent_id) 
GROUP BY foo_id;

There's no logical way this could produce ambiguous results. Each row in foo gets its own group, if we GROUP BY the primary key of foo. So any column from foo can have only one value in the group. Even joining to another table referenced by a foreign key in foo can have only one value per group, if the groups are defined by the primary key of foo.

MySQL and SQLite trust you to design logically unambiguous queries. Formally, every column in the select-list must be a functional dependency of the columns in the GROUP BY criteria. If you don't adhere to this, it's your fault. :-)

Standard SQL is more strict and disallows some queries that could be unambiguous--probably because it would be too complex for the RDBMS to be sure in general.

Bill Karwin
A: 

MySQLs group by is not consistent with the standard SQL behaviour , MySQL makes it easy to get other columns BUT at the same time u can never be sure which one u will get.

True. Actually it corresponds more to SELECT DISTINCT ON mode in postgres for instance, except this allows you to specify the order of the rows prior to distinctization (?) and therefore which row you will get (ie most recent, oldest, whatever).

Note MySQL in "sql compliant" mode will reject GROUP BY with nonspecified columns as in your example.

peufeu
A: 

This is a question not answer. So how can you produce the result like Sqlite (return the last result)?

The reason I am trying to do this.

Imagine a table with salesman , sales_amount, sales_date

so I want to find out total which salesman done the last deal they have done

SELECT SUM(sales_amount) AS total_amount , salesman, sales_date FROM sales GROUP BY salesman ORDER BY sales_date DESC

But this doesn't work, only return the first result not the latest one ...

Joel
-1 Ask it as a question then! Link to this question if appropriate.
Martin Smith