Hi,
Lets say I am analyzing how high school sports records affect school attendance.
So I have a table in which each row corresponds to a high school basketball game. Each game has an away team id and a home team id (FK to another "team table") and a home score and an away score and a date. I am writing a query that matches attendance with this seasons basketball games.
My sample output will be (#_students_missed_class
, day_of_game
, home_team
, away_team
, home_team_wins_this_season
, away_team_wins_this_season
)
I now want to add how each team did the previous season to my analysis. Well, I have their previous season stored in the game table but i should be able to accomplish that with a subselect.
So in my main select statement I add the subselect:
SELECT COUNT(*) FROM game_table
WHERE game_table.date
BETWEEN 'start of previous season' AND 'end of previous season'
AND (
(game_table.home_team = team_table.id
AND game_table.home_score > game_table.away_score)
OR (game_table.away_team = team_table.id
AND game_table.away_score > game_table.home_score))
In this case team-table.id refers to the id of the home_team so I now have all their wins calculated from the previous year.
This method of calculation is neither time nor resource intensive. The Explain SQL shows that I have ALL in the Type field and I am not using a Key and the query times out. I'm not sure how I can accomplish a more efficient query with a subselect. It seems proposterously inefficient to have to write 4 of these queries (for home wins, home losses, away wins, away losses).
I am sure this could be more lucid. I'll absolutely add color tomorrow if anyone has questions