views:

54

answers:

6

Just curious about SQL syntax. So if I have

SELECT 
 itemName as ItemName,
 substring(itemName, 1,1) as FirstLetter,
 Count(itemName)
FROM table1
GROUP BY itemName, FirstLetter

This would be incorrect because

GROUP BY itemName, FirstLetter 

really should be

GROUP BY itemName, substring(itemName, 1,1)

But why can't we simply use the former for convenience?

A: 

At least in PostgreSQL you can use the column number in the resultset in your GROUP BY clause:

SELECT 
 itemName as ItemName,
 substring(itemName, 1,1) as FirstLetter,
 Count(itemName)
FROM table1
GROUP BY 1, 2

Of course this starts to be a pain if you are doing this interactively and you edit the query to change the number or order of columns in the result. But still.

Bill Gribble
`GROUP BY FirstLetter` is allowed in Postgresql. To wit, try running this in Postgresql: select substring(table_name,1,2) as tnamefrom information_schema.tablesgroup by tname
Michael Buen
A: 

Some DBMSs will let you use an alias instead of having to repeat the entire expression.
Teradata is one such example.

I avoid ordinal position notation as recommended by Bill for reasons documented in this SO question.

The easy and robust alternative is to always repeat the expression in the GROUP BY clause.
DRY does NOT apply to SQL.

Adam Bernier
+1  A: 

You could always use a subquery so you can use the alias; Of course, check the performance (Possible the db server will run both the same, but never hurts to verify):

SELECT ItemName, FirstLetter, COUNT(ItemName)
FROM (
    SELECT ItemName, SUBSTRING(ItemName, 1, 1) AS FirstLetter
    FROM table1
    ) ItemNames
GROUP BY ItemName, FirstLetter
Chris Shaffer
A: 

Back in the day I found that Rdb, the former DEC product now supported by Oracle allowed the column alias to be used in the GROUP BY. Mainstream Oracle through version 11 does not allow the column alias to be used in the GROUP BY. Not sure what Postgresql, SQL Server, MySQL, etc will or won't allow. YMMV.

Bob Jarvis
+1  A: 

SQL Server doesn't allow you to reference the alias in the GROUP BY clause because of the logical order of processing. The GROUP BY clause is processed before the SELECT clause, so the alias is not known when the GROUP BY clause is evaluated. This also explains why you can use the alias in the ORDER BY clause.

Here is one source for information on the SQL Server logical processing phases.

bobs
+2  A: 

SQL is implemented as if a query was executed in the following order:

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause
  6. ORDER BY clause

This order in most relational database systems explains which names (columns or aliases) are valid because they must have been introduced in a previous step. There are exceptions though.

Codo
I like this explanation. Although I can't speculate how difficult it is to add it to an engine as syntactic sugar.
Haoest