I'm working on the reports for our time tracking app. Each time entry belongs to a project and a service. Here is a simplified query that groups time entries by project and service:
SELECT
    projects.name AS project_name,
    services.name AS service_name,
    SUM(minutes) AS minutes
FROM `time_entries`
JOIN `projects` ON `projects`.id = `time_entries`.project_id
JOIN `services` ON `services`.id = `time_entries`.service_id
GROUP BY
    time_entries.project_id,
    time_entries.service_id
ORDER BY
    SUM(minutes) DESC
This results in a table like this:
+---------------+--------------+---------+
| project_name  | service_name | minutes |
+---------------+--------------+---------+
| Business Card | Consulting   |    4800 |
| Microsite     | Coding       |    3200 |
| Microsite     | Consulting   |    2400 |
| Microsite     | Design       |    2400 |
| Business Card | Design       |     800 |
+---------------+--------------+---------+
What I'm trying to achieve, though, is to sort by the summed minutes per project. The project »Microsite« should be on top instead of »Business Card«, because it has more minutes in total (8000 vs. 5600):
+---------------+--------------+-----------------+---------+
| project_name  | service_name | project_minutes | minutes |
+---------------+--------------+-----------------+---------+
| Microsite     | Coding       |            8000 |    3200 |
| Microsite     | Consulting   |            8000 |    2400 |
| Microsite     | Design       |            8000 |    2400 |
| Business Card | Consulting   |            5600 |    4800 |
| Business Card | Design       |            5600 |     800 |
+---------------+--------------+-----------------+---------+
The only way I've found to get the column »project_minutes« is to create a table first and join it with itself. The query I came up with:
DROP TABLE IF EXISTS group2;

CREATE TABLE group2
SELECT
    projects.id AS project_id,
    projects.name AS project_name,
    services.name AS service_name,
    SUM(minutes) AS minutes
FROM `time_entries`
JOIN `projects` ON `projects`.id = `time_entries`.project_id
JOIN `services` ON `services`.id = `time_entries`.service_id
GROUP BY
    time_entries.project_id,
    time_entries.service_id
ORDER BY
    SUM(minutes) DESC
LIMIT 0, 30;

SELECT
    project_name, service_name, project_minutes, minutes
FROM
    group2
LEFT JOIN
    (
        SELECT project_id, SUM(minutes) AS project_minutes
        FROM group2
        GROUP BY project_id
    ) AS group1 ON group1.project_id = group2.project_id
ORDER BY
    project_minutes DESC,
    minutes DESC;
I can't even use a temporary table for this, apparently due to a MySQL bug (?) about reopening temporary tables: http://www.google.com/search?&q=site:bugs.mysql.com+reopen+temporary+table
My questions:
- What would be the best way to get a column like »project_minutes« that sums up a group's minutes and adds the result as an extra column? Is there a neat SQL trick I'm not aware of?
- If there is no good answer to my first question, do you think it makes sense to create an extra table for each query? Is that faster than doing this logic manually in the application code afterwards? We use Rails, in case that makes any difference.
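
To make the second question concrete, this is roughly what I mean by doing the logic in the code. It is only a sketch: it assumes the rows of the first grouped query are already loaded as an array of hashes, and the helper name `add_project_minutes` and the hash keys are made up for illustration, not taken from our app.

```ruby
# Hypothetical helper, not part of our app: takes the rows of the first
# grouped query (one hash per project/service pair) and adds the project total.
def add_project_minutes(rows)
  # Sum the minutes of all rows that belong to the same project.
  # Real code would key on project_id; the sample rows only carry the name.
  totals = Hash.new(0)
  rows.each { |row| totals[row[:project_name]] += row[:minutes] }

  with_totals = rows.map do |row|
    row.merge(project_minutes: totals[row[:project_name]])
  end

  # Largest project first, then largest row within the project. The names
  # are only tie-breakers so that the order is deterministic.
  with_totals.sort_by do |row|
    [-row[:project_minutes], row[:project_name], -row[:minutes], row[:service_name]]
  end
end

# Sample rows copied from the first result table above.
rows = [
  { project_name: "Business Card", service_name: "Consulting", minutes: 4800 },
  { project_name: "Microsite",     service_name: "Coding",     minutes: 3200 },
  { project_name: "Microsite",     service_name: "Consulting", minutes: 2400 },
  { project_name: "Microsite",     service_name: "Design",     minutes: 2400 },
  { project_name: "Business Card", service_name: "Design",     minutes: 800 }
]

result = add_project_minutes(rows)
result.each do |row|
  puts row.values_at(:project_name, :service_name, :project_minutes, :minutes).join(" | ")
end
```

This only touches the already grouped rows, not the raw time entries, so the open question is whether it still holds up once more grouping levels are involved.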
Thanks a lot for your help!
UPDATE
Thanks for your replies so far. I'm summarizing them as a gist to get a better overview: http://gist.github.com/553560
Am I right that there is no way around querying the time_entries table once per GROUP BY level? If so, do you see performance issues given the following facts:
- The table time_entries is by far the one with the most rows (~4 million)
- The user can group by up to 6 columns. Have a look at this screenshot: http://dl.dropbox.com/u/732913/time_entries_grouped_by_customer_project_service_user.png