views:

33

answers:

1

I have information on school athletics, with tables for school, season, cashflow, and cashflow_group. I'm trying to query for all schools with cashflow in one or more given cashflow_groups within a user-specified range. I need to query multiple different categories in the same query. I'm having trouble.

My query is below. The reason I did it this way is that I can sum multiple cashflow groups, and I thought it worked until I looked closely and saw that it sums the cashflow amount for all schools as total_cashflow_amount, when each school should have a different total, the sum of its associated cashflow rows.

SELECT distinct schools.*, 
  (SELECT sum(`cashflows`.amount) AS cf FROM `schools` 
  INNER JOIN `seasons` ON seasons.school_id = schools.id 
  INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
  INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
  WHERE ((`cashflow_groups`.id = 12) AND (`seasons`.`year` = 2010))) AS total_branding_cashflow 
FROM `schools` 

INNER JOIN `seasons` ON seasons.school_id = schools.id 
INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
INNER JOIN `seasons` seasons_schools ON seasons_schools.school_id = schools.id 
WHERE (`seasons`.`year` = 2010) 
GROUP BY schools.id 
HAVING (total_branding_cashflow BETWEEN 50000000 AND 100000000) 
ORDER BY schools.name ASC LIMIT 0, 50

In this query, total_branding_cashflow is the total for all schools. I can't figure out how to get the total for each school individually in the subquery.

As it stands, I get a result like

| school.id | … | total_branding_cashflow |
|     2     |   |       900000            |
|     5     |   |       900000            |

when what I want is

| school.id | … | total_branding_cashflow |
|     2     |   |       500000            |
|     5     |   |       400000            |

Adding a GROUP BY to the subquery gives me the sum of each school's cashflow in a separate row, but subqueries only work when they give a single row, so this doesn't help me.

What am I missing? The reason I though to use subqueries is that I want to be able to look up multiple different cashflow_groups at the same time, like this:

SELECT distinct schools.*, 
  (SELECT sum(`cashflows`.amount) AS cf FROM `schools` 
  INNER JOIN `seasons` ON seasons.school_id = schools.id 
  INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
  INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
WHERE ((`cashflow_groups`.id = 12) AND (`seasons`.`year` = 2010)) ) AS total_branding_cashflow, 
  (SELECT sum(`cashflows`.amount) AS cf FROM `schools` 
  INNER JOIN `seasons` ON seasons.school_id = schools.id 
  INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
  INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
WHERE ((`cashflow_groups`.id = 1) AND (`seasons`.`year` = 2010)) ) AS total_ticket_sales_cashflow,
  (SELECT sum(`cashflows`.amount) AS cf FROM `schools` 
  INNER JOIN `seasons` ON seasons.school_id = schools.id 
  INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
  INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
WHERE ((`cashflow_groups`.id = 7) AND (`seasons`.`year` = 2010)) ) AS total_university_cashflow 
FROM `schools` 
INNER JOIN `seasons` ON seasons.school_id = schools.id 
INNER JOIN `cashflows` ON cashflows.season_id = seasons.id 
INNER JOIN `cashflow_groups` ON `cashflow_groups`.id = `cashflows`.cashflow_group_id 
INNER JOIN `seasons` seasons_schools ON seasons_schools.school_id = schools.id 
WHERE (`seasons`.`year` = 2010) 
GROUP BY schools.id 
HAVING (total_branding_cashflow BETWEEN 50000000 AND 100000000) AND 
       (total_ticket_sales_cashflow BETWEEN 50000000 AND 100000000) AND 
       (total_university_cashflow BETWEEN 0 AND 10000000) 
ORDER BY schools.name ASC LIMIT 0, 50

I didn't think I could do this with a SUM that isn't in its own subquery. I'm working on a rails app, and could probably come up with a way to do this via ruby code. But that doesn't seem right and I'd prefer to get it down in SQL if possible. Thanks!

+1  A: 

Some suggestions:

  • Join on seasons once. A join causes rows from the left table to be duplicated, so they can be summed twice by the sum aggregate. When in doubt, run the query without group by for an example school.
  • You'd have to relate the subquery to the outer query with something like inner_schools.id = outer_schools.id
  • But as far as I can see, you don't need a subquery at all

For example:

SELECT  schools.*
,       sum(cashflows.amount) total_branding_cashflow
FROM    schools
JOIN    seasons
ON      seasons.school_id = schools.id 
        and seasons.year = 2010
JOIN    cashflows
ON      cashflows.season_id = seasons.id 
        and cashflow_group_id = 12
GROUP BY 
        schools.id 
HAVING  total_branding_cashflow BETWEEN 50000000 AND 100000000

For multiple categories, you could use a case:

SELECT  schools.*
,       sum(case when cashflow_group_id = 1 then cashflows.amount end) total1
,       sum(case when cashflow_group_id = 12 then cashflows.amount end) total12
FROM    schools
JOIN    seasons
ON      seasons.school_id = schools.id 
        and seasons.year = 2010
JOIN    cashflows
ON      cashflows.season_id = seasons.id 
GROUP BY 
        schools.id 
Andomar
You type too fast. :)
Tom H.
Thanks, that's a lot simpler and nicer. I'm playing around with it now: but the reason I originally bothered with subqueries at all is because I want to have the ability to filter off **multiple** cashflow_groups in the same query. My understanding was that when I do the join this way the multiple part won't work? Because I ended up joining on cashflows where the cashflow_group_id was both 12 and 1, and that's an empty set. And 12 or 1 would give me one sum for the two categories? I'd like very much if I could simplify it like this though! I'll keep playing around.
kjell_
Got it: I had no idea about multiple joins on the same table. But great.`JOIN cashflows AS branding_cf ON branding_cf.season_id = seasons.id and branding_cf.cashflow_group_id = 12``JOIN cashflows AS tickets_cf ON tickets_cf.season_id = seasons.id and tickets_cf.cashflow_group_id = 1`
kjell_
@jkell_: Multiple joins on the same table will probably give you the wrong result: aren't the sums suspiciously equal? You can try an alternative approach added at the end of my answer
Andomar
Yep, you're right. I didn't have enough data for it to look completely out of whack but as I added more it got bad. Thanks for the sum(case strategy.
kjell_