We are coding for a MySQL data warehousing application that stores descriptive data (User ID, Work ID, Machine ID, Start and End Time columns in the first table below) associated with time and production quantity data (Output and Time columns in the first table below) upon which aggregate (SUM, COUNT, AVG) functions are applied. We now wish to dis-aggregate time data for another type of analysis.
Our current data table design:
+---------+---------+------------+---------------------+---------------------+--------+------+
| User ID | Work ID | Machine ID | Event Start Time | Event End Time | Output | Time |
+---------+---------+------------+---------------------+---------------------+--------+------+
| 080025 | ABC123 | M01 | 2008-01-24 16:19:15 | 2008-01-24 16:34:45 | 2120 | 930 |
+---------+---------+------------+---------------------+---------------------+--------+------+
Reprocessing dis-aggregation that we would like to do would be to transform table content based on a granularity of minutes, rather than the current production event ("Event Start Time" and "Event End Time") granularity. The resulting reprocessing of existing table rows would look like:
+---------+---------+------------+---------------------+--------+
| User ID | Work ID | Machine ID | Production Minute | Output |
+---------+---------+------------+---------------------+--------+
| 080025 | ABC123 | M01 | 2010-01-24 16:19 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:20 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:21 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:23 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:24 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:25 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:26 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:27 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:28 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:29 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:30 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:31 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:33 | 133 |
| 080025 | ABC123 | M01 | 2010-01-24 16:34 | 133 |
+---------+---------+------------+---------------------+--------+
So the reprocessing would take an existing row of data created at the granularity of production event and modify the granularity to minutes, eliminating redundant (Event End Time, Time) columns while doing so. It assumes a constant rate of production and divides output by the difference in minutes plus one to populate the new table's Output column.
I know this can be done in code...but can it be done entirely in a MySQL insert statement (or otherwise entirely in MySQL)? I am thinking of a INSERT ... INTO construction but keep getting stuck. An additional complexity is that there are hundreds of machines to include in the operation so there will be multiple rows (one for each machine) for each minute of the day.
Any ideas would be much appreciated. Thanks.