I'm looking for best-practice advice on how to speed up queries while minimizing the overhead of calling the date/mktime functions. To simplify the problem, assume the following table layout:
CREATE TABLE my_table(
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
important_data INTEGER,
date INTEGER);
The user can choose to show 1) all entries between two dates:
SELECT * FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Output:
10-21-2009 12:12:12, 10002
10-21-2009 14:12:12, 15002
10-22-2009 14:05:01, 20030
10-23-2009 15:23:35, 300
....
I don't think there is much to improve in this case.
2) Summarize/group the output by day, week, month, year:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data
FROM my_table
WHERE date >= ? AND date <= ?;
Example output by month:
10-2009, 100002
11-2009, 200030
12-2009, 3000
01-2010, 0 /* <- very important to show empty dates, with no entries in the table! */
....
To accomplish option 2) I'm currently running a very costly for-loop with mktime/date like the following:
for(...){ /* example for group by day */
    $span_from = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i, date("Y", $time_min));
    $span_to = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i+1, date("Y", $time_min));
    $query = "..";
    $output = date("m-d-y", ..);
}
What are my ideas so far? Add redundant INTEGER columns for day (20091212), month (200912), week (200942) and year (2009). That way I can get rid of all the unnecessary queries in the for loop. However, I still need a fast way to generate all the dates that have no matching rows in the database. One way to simply shift the problem would be to let MySQL do the job: use one big query that generates all the dates with MySQL date functions and left-joins the data onto them. Would it be wise to let MySQL take the extra load? Either way, I'm reluctant to keep all these mktime/date calls in the for loop. Since I have complete control over the table layout and the code, even suggestions involving major changes are welcome!
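A minimal sketch of the PHP side of that idea, for the group-by-day case: build one zero-filled bucket per day up front, run a single grouped query, and overlay its rows on the buckets so that empty days still show up as 0. The function names `emptyDayBuckets` and `fillBuckets`, the `day`/`total` aliases, and the use of `FROM_UNIXTIME` are assumptions for illustration. The sketch also assumes the `date` column holds Unix timestamps and that PHP and the MySQL session use the same time zone.

```php
<?php
// Hypothetical helper: one zero-filled bucket per calendar day in [$from, $to],
// where $from and $to are Unix timestamps (assumed, matching the INTEGER date column).
function emptyDayBuckets(int $from, int $to): array
{
    $start = (new DateTimeImmutable())->setTimestamp($from)->setTime(0, 0);
    // DatePeriod excludes its end date, so step one day past the last day.
    $end = (new DateTimeImmutable())->setTimestamp($to)->setTime(0, 0)->modify('+1 day');

    $buckets = [];
    foreach (new DatePeriod($start, new DateInterval('P1D'), $end) as $day) {
        $buckets[$day->format('Y-m-d')] = 0;
    }
    return $buckets;
}

// Hypothetical helper: overlay grouped query rows (keys 'day' and 'total') on the buckets.
function fillBuckets(array $buckets, iterable $rows): array
{
    foreach ($rows as $row) {
        if (array_key_exists($row['day'], $buckets)) {
            $buckets[$row['day']] = (int) $row['total'];
        }
    }
    return $buckets;
}

// One grouped query instead of one query per day (a sketch, not run here).
$sql = "SELECT DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m-%d') AS day,
               SUM(important_data) AS total
        FROM my_table
        WHERE date >= ? AND date <= ?
        GROUP BY day";

// Usage with rows as they might come back from the query above.
date_default_timezone_set('UTC'); // fixed only to make the example reproducible
$rows = [
    ['day' => '2009-10-21', 'total' => '25004'],
    ['day' => '2009-10-23', 'total' => '300'],
];
$result = fillBuckets(
    emptyDayBuckets(mktime(0, 0, 0, 10, 21, 2009), mktime(0, 0, 0, 10, 23, 2009)),
    $rows
);
// $result is ['2009-10-21' => 25004, '2009-10-22' => 0, '2009-10-23' => 300]
```

The same pattern would work for week, month, or year buckets by changing the interval and the two format strings, at the cost of keeping the PHP and SQL formats in sync.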
Update
Thanks to Greg I came up with the following SQL query. However, it still bugs me to use 50 lines of SQL statements, built up with PHP, for something that could perhaps be done faster and more elegantly some other way:
SELECT * FROM (
SELECT DATE_ADD('2009-01-30', INTERVAL 0 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 1 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 2 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 3 DAY) AS day UNION ALL
......
SELECT DATE_ADD('2009-01-30', INTERVAL 50 DAY) AS day ) AS dates
LEFT JOIN (
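SELECT DATE_FORMAT(date, '%Y-%m-%d') AS day, SUM(data) AS data
FROM test
/* group by the formatted day; a bare "GROUP BY date" would resolve to the raw column, not the alias */
GROUP BY DATE_FORMAT(date, '%Y-%m-%d')
) AS results
ON DATE_FORMAT(dates.day, '%Y-%m-%d') = results.day;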
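For reference, the date-series part of the query above can be generated with a few lines of PHP rather than written out by hand. This is only a sketch: `buildDateSeriesSql` is a hypothetical helper name, and the start date is validated as `Y-m-d` before being embedded in the SQL string.

```php
<?php
// Hypothetical helper: build the "SELECT DATE_ADD(...) AS day UNION ALL ..." series
// for $days consecutive days starting at $start (format Y-m-d).
function buildDateSeriesSql(string $start, int $days): string
{
    // Validate strictly so that only a real calendar date ends up in the SQL text.
    $parsed = DateTimeImmutable::createFromFormat('!Y-m-d', $start);
    if ($parsed === false || $parsed->format('Y-m-d') !== $start) {
        throw new InvalidArgumentException('start must be a valid Y-m-d date');
    }
    if ($days < 1) {
        throw new InvalidArgumentException('days must be at least 1');
    }

    $parts = [];
    for ($i = 0; $i < $days; $i++) {
        $parts[] = "SELECT DATE_ADD('$start', INTERVAL $i DAY) AS day";
    }
    return implode(" UNION ALL\n", $parts);
}

// Usage: wrap the series as the left side of the join.
$series = buildDateSeriesSql('2009-01-30', 3);
$sql = "SELECT * FROM (\n$series ) AS dates";
```

This keeps the PHP short, but the generated SQL still grows by one line per day, so it does not address the underlying concern about query size.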