I have a table in a PostgreSQL database that tracks usage of various resources. In the (simplified) schema, each row has a ResourceID, a StartTime timestamp, and an EndTime timestamp. Each row represents a time span during which the resource was in use, so the table might look like this (the timestamps also include dates, removed below for clarity):

ResourceID  StartTime   EndTime
---------------------------------------
1           12:30:00    12:45:00
1           12:48:25    12:50:22
2           12:32:50    12:33:44

The database would have perhaps a thousand different resources tracked and a few million rows in the table. I've recently received a feature request for a new report that details the time periods in which a group of resources are all in use, so the query might be "Between 12:00 and 15:00, display all the time periods when resources 1, 2, 5, 8 and 12 were all in use". In addition, the query should take a "Minimum Idle" period: the length of time a resource must go unused before it counts as idle. (Example: if Minimum Idle is 2 seconds, a resource in use 12:00:00-12:01:00 and 12:01:01-12:02:00 would not be considered to have any idle time, even though technically it was not in use for 1 second.)
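To make the Minimum Idle rule concrete, here is a minimal Python sketch of the behaviour I have in mind for a single resource. The function name `merge_with_min_idle` and the date used in the example are made up for illustration; this only describes the rule, it is not a proposal for how to do it efficiently in the database.

```python
from datetime import datetime, timedelta

def merge_with_min_idle(spans, min_idle):
    """Merge one resource's (start, end) spans when the gap between
    consecutive spans is shorter than min_idle."""
    merged = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] < min_idle:
            # Gap is too short to count as idle: extend the previous span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# The example from the question, on a hypothetical date.
day = datetime(2000, 1, 1)
spans = [
    (day.replace(hour=12, minute=0, second=0), day.replace(hour=12, minute=1, second=0)),
    (day.replace(hour=12, minute=1, second=1), day.replace(hour=12, minute=2, second=0)),
]
merged = merge_with_min_idle(spans, timedelta(seconds=2))
# merged holds a single span, 12:00:00 to 12:02:00
```

With a Minimum Idle of 1 second instead, the same two spans would stay separate, because the 1-second gap is then long enough to count as idle.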

The output of the query would be a list of start/end times covering every period when all the queried resources were in use. From that point, I need to compute some statistics on that dataset, which won't be a problem for me, but I'm at a loss as to how to efficiently create that dataset from the original table. If necessary, I can log additional information to the database at insert time. Were it not for the arbitrary resource subset requirement, I could simply maintain a table of all the idle times at insert time, but with 1000 different resources and any possible combination of 1-1000 resources in a query, that seems excessive, as only a very small number of combinations will ever be reported on.
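For reference, the output I am after can be described by the following Python sketch, which intersects the in-use spans of each queried resource within the report window. It assumes each resource's spans are already sorted, non-overlapping, and adjusted for Minimum Idle; the names `intersect_two` and `all_in_use` and the date are hypothetical. Pulling millions of rows into application code like this is exactly what I would like to avoid, so treat it as a specification of the result rather than a solution.

```python
from datetime import datetime

def intersect_two(a, b):
    """Intersect two sorted lists of disjoint (start, end) spans."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever span ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def all_in_use(per_resource, window_start, window_end):
    """Spans inside the window during which every resource was in use."""
    result = [(window_start, window_end)]
    for spans in per_resource:
        result = intersect_two(result, spans)
    return result

# Resources 1 and 2 from the sample table, on a hypothetical date.
d = datetime(2000, 1, 1)
r1 = [(d.replace(hour=12, minute=30), d.replace(hour=12, minute=45)),
      (d.replace(hour=12, minute=48, second=25), d.replace(hour=12, minute=50, second=22))]
r2 = [(d.replace(hour=12, minute=32, second=50), d.replace(hour=12, minute=33, second=44))]
both = all_in_use([r1, r2], d.replace(hour=12), d.replace(hour=15))
# both holds one span, 12:32:50 to 12:33:44
```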

Thanks in advance for any help or insights.