views:

48

answers:

3

I have data in 2 tables, and I want to create a report.

Table A:

  • tableAID (primary key)
  • name

Table B:

  • tableBID (primary key)
  • grade
  • tableAID (foreign key, references Table A)

There is much more to both tables, but those are the relevant columns.

The query I want to run, conceptually, is this:

select TableA.name, avg(TableB.grade) where TableB.tableAID = TableA.tableAID

The problem of course is that I'm using an aggregate function (avg), and I can rewrite it like this:

select avg(grade), tableAID from TableB group by tableAID

but then I only get the ID of TableA, whereas I really need that name column which appears in TableA, not just the ID.

Is it possible to write a query to do this in one statement, or would I first need to execute the second query I listed, get the list of id's, then query each record in TableA for the name column... seems to me I'm missing something obvious here, but I'm (quite obviously) not an sql guru...

+3  A: 

You can do this:

SELECT avg(b.grade), a.tableAID, a.name 
FROM TableA a 
     JOIN TableB b
       ON b.tableAID = a.tableAID
GROUP BY a.tableAID, a.name

Just adding it to the group will work fine in your case.

Nick Craver
+1 For speed of response - was that 15 seconds or 20 seconds after it was posted!
amelvin
Yeah, somehow he's pre-computing the answers. Or possibly just a very good caching algorithm.
Larry Lustig
@amelvin, @Larry - I confess, I'm a bot :(
Nick Craver
"... will work fine in your case." Can I assume that I can add more columns to the "group by" from TableA that I might need as well, without any performance issues?
@user85116 - This is correct, the optimizer will take care of this pretty well, just add whatever columns you need. If in the off-chance you do experience some sort of performance degradation then post another question with your query/table setup and you'll get specific optimization/indexing tips...lots of SQL optimization experts in the community.
Nick Craver
If there is only one name per table A record, there's no need to specify the name in the GROUP clause. Grouping only on the index column would be preferred. The only reason that Nick has added the name to the GROUP clause is because his database vendor, and probably yours also, prevents him from having a column in the SELECT clause that isn't either an aggregate function or in the GROUP clause. In MySQL, you don't have this restriction, so you could leave name out of the GROUP column.
Marcus Adams
@Marcus - That's indeed the reason, and I hate that most providers don't figure this out automatically as well.
Nick Craver
+2  A: 
 SELECT AVG(TableB.grade), TableB.tableAID, TableA.Name 
      FROM TableA INNER JOIN TableB
      ON TableA.TableAID = TableB.TableAID
      GROUP BY TableA.tableAID, TableA.Name
Larry Lustig
+1 To encourage you to build a better pre-cog algorithm
amelvin
A: 

Alternative answer:

SELECT AVG(b.grade), a.tableAID, MAX(a.name )
FROM TableA a 
     JOIN TableB b
       ON b.tableAID = a.tableAID
GROUP BY a.tableAID

Just to get you thinking.

Marcus Adams