ansaurus

Question

Why no "SELECT foo.* ... GROUP BY foo.id" in Postgres?

Answer 1

+1 A:

What exactly would you have PostgreSQL output? You're using an aggregate function and also trying to output "something" that is neither grouped nor aggregated.

Ah. I see what you may want to do. Use a subselect.

select foo.*, (select count(*) from bar where bar.foo_id=foo.id) from foo;

Check with EXPLAIN that the plan looks good, though. A subselect is not always bad. I just checked against a database I'm using, and the execution plan for that query was good.

Yes, in theory grouping by foo.id would be enough (i.e. your query plus "group by foo.id"). But apparently (I tested it) PostgreSQL will not do that. The other option is to "group by foo.id, foo.foo, foo.bar, foo.baz" and every other column that's in "foo.*".

Another way, which Guffa is on to, is this:

SELECT foo.*, COALESCE(sub.cnt, 0)
FROM foo
LEFT OUTER JOIN (
  SELECT foo_id, count(*) AS cnt
  FROM bar
  GROUP BY foo_id) sub
ON sub.foo_id = foo.id;

This runs two queries, though (the outer query plus one subquery, which is run just once). That can matter, but probably won't. If you can do without "foo.*", you can use the other option above, which explicitly groups by all columns.
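To check that the two queries above return the same counts, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL so it runs without a server, and the table contents are made up for illustration. The foo row with id 3 deliberately has no rows in bar.

```python
import sqlite3

# Hypothetical sample data: foo 3 has no matching rows in bar.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER);
    INSERT INTO foo VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO bar (foo_id) VALUES (1), (1), (2);
""")

# Variant 1: correlated subselect in the select list.
subselect_rows = conn.execute("""
    SELECT foo.*, (SELECT count(*) FROM bar WHERE bar.foo_id = foo.id) AS cnt
    FROM foo
    ORDER BY foo.id
""").fetchall()

# Variant 2: aggregate bar once, then LEFT JOIN so that foo rows
# without any bar rows are kept; COALESCE turns their NULL into 0.
join_rows = conn.execute("""
    SELECT foo.*, COALESCE(sub.cnt, 0) AS cnt
    FROM foo
    LEFT OUTER JOIN (
        SELECT foo_id, count(*) AS cnt
        FROM bar
        GROUP BY foo_id) sub
    ON sub.foo_id = foo.id
    ORDER BY foo.id
""").fetchall()

print(subselect_rows)
print(join_rows)
```

Both variants list every foo row with its count, including a 0 for the foo without bars. Which one is faster depends on the data and the planner, so EXPLAIN on the real database is still the thing to check.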

Thomas 2009-07-16 07:47:37

Answer 2

A:

A GROUP BY clause requires that every column the query returns is either listed in the GROUP BY clause or wrapped in an aggregate function (such as the COUNT in your example). Without seeing what your GROUP BY clause is or what the columns of foo are, it's hard to tell exactly what is going on, but I would guess the problem is that foo.* is trying to return one or more columns that are not in your GROUP BY clause.

This is really a general property of SQL and should not be specific to PostgreSQL. No idea why it worked for you with SQLite or MySQL -- perhaps all the columns in foo.* are actually in your GROUP BY clause but PostgreSQL can't figure that out -- so try listing out all of the columns of foo explicitly.

Martin B 2009-07-16 07:53:50

Answer 3

+2 A:

Some databases are more relaxed about this, for good and bad. The query is unspecific, so the result is equally unspecific. If the database allows the query, it will return one record from each group and it won't care which one. Other databases are more specific, and require you to specify which value you want from the group. They won't let you write a query that has an unspecific result.
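As an illustration of the "relaxed" behaviour, here is a small sketch using SQLite through Python's sqlite3 module. SQLite is my choice of example here, and the table and its contents are hypothetical. The column label is neither grouped nor aggregated, yet the query is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, label TEXT);
    INSERT INTO bar (foo_id, label) VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")

# 'label' is neither in GROUP BY nor inside an aggregate. SQLite accepts
# this anyway and takes the value from some row of each group.
rows = conn.execute("""
    SELECT foo_id, label, count(*)
    FROM bar
    GROUP BY foo_id
    ORDER BY foo_id
""").fetchall()

for foo_id, label, cnt in rows:
    # For foo_id 1 the label may be 'x' or 'y'; do not rely on which.
    print(foo_id, label, cnt)
```

You get one row per group, but nothing in the query says which label belongs in the row for foo_id 1. That is exactly the unspecific result a stricter database refuses to produce.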

The only values that you can select without an aggregate are the ones in the group by clause:

select foo.id, count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id

You can use aggregates to get other values:

select foo.id, min(foo.price), count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id

If you want all the values from the foo table, you can either put them all in the group by clause (if that gives the correct result):

select foo.id, foo.price, foo.name, foo.address, count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id, foo.price, foo.name, foo.address

Or, you can join the table with a subquery:

select foo.id, foo.price, foo.name, foo.address, sub.bar_count
from foo
inner join (
   select foo.id, count(bar.id) as bar_count
   from foo inner join bar on foo.id = bar.foo_id
   group by foo.id
) sub on sub.id = foo.id

Guffa 2009-07-16 10:43:37

That last one will not work if there aren't any entries in bar for "a foo". You'll have to do: SELECT foo.*, COALESCE(sub.bar_count) from foo left outer join (select ...) sub on sub.id = foo.id;
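A quick way to see the difference is the sketch below. It assumes SQLite via Python's sqlite3 module instead of PostgreSQL, and the data is made up: foo 2 has no rows in bar. The subquery here aggregates bar directly, which is my own simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER);
    INSERT INTO foo VALUES (1, 'a'), (2, 'b');
    INSERT INTO bar (foo_id) VALUES (1), (1);
""")

# Per-foo counts, computed from bar alone (hypothetical helper string).
counts = "SELECT foo_id, count(*) AS bar_count FROM bar GROUP BY foo_id"

# INNER JOIN: foo 2 has no row in the subquery, so it disappears.
inner_rows = conn.execute(f"""
    SELECT foo.id, sub.bar_count
    FROM foo INNER JOIN ({counts}) sub ON sub.foo_id = foo.id
    ORDER BY foo.id
""").fetchall()

# LEFT OUTER JOIN: foo 2 is kept, and COALESCE replaces its NULL with 0.
left_rows = conn.execute(f"""
    SELECT foo.id, COALESCE(sub.bar_count, 0)
    FROM foo LEFT OUTER JOIN ({counts}) sub ON sub.foo_id = foo.id
    ORDER BY foo.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The inner join returns only foo 1, while the left outer join returns both foos, with a count of 0 for foo 2.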

Thomas 2009-07-17 08:00:03

Also, you don't need to do a join in the subselect.

Thomas 2009-07-17 08:07:08

err.. my comment version missed a few characters. See my answer above for what I meant.

Thomas 2009-07-17 08:09:44
