We have some tables, which have a structure like:

start, -- datetime
end,   -- datetime
cost   -- decimal

So, for example, there might be rows like:

01/01/2010 10:08am, 01/01/2010 1:56pm, 135.00

01/01/2010 11:01am, 01/01/2010 3:22pm, 118.00

01/01/2010 06:19pm, 01/02/2010 1:43am, 167.00

Etc...

I'd like to transform this (with a function?) so that it returns data in a format like:

10:00am, 10:15am, X, Y, Z

10:15am, 10:30am, X, Y, Z

10:30am, 10:45am, X, Y, Z

10:45am, 11:00am, X, Y, Z

11:00am, 11:15am, X, Y, Z

....

Where: X = the number of rows that match

Y = the cost / expense for that chunk of time

Z = the total number of minutes, summed over the matching rows, that fall inside this chunk of time

I.e., for the above data, we might have:

10:00am, 10:15am, 1, (135 / 228 minutes) * 7 minutes, 7

  • The first row starts at 10:08am, so only 7 minutes are used from 10:00-10:15.

  • There are 228 minutes in the start->end time.

....

11:00am, 11:15am, 2, ((135+118) / (228+261) minutes) * (15+14) minutes, 29

  • The second row starts right after 11:00am, so we need 15 minutes from the first row, plus 14 minutes from the second row.

  • There are 261 minutes in the second start->end time
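
To sanity-check the arithmetic outside the database, here is a small Python sketch of the two buckets worked through above. The helper names (overlap_minutes, bucket_stats) are made up for illustration, and the formula is the pooled one written out above: total cost of the matching rows, divided by their total minutes, times the minutes that fall inside the bucket.

```python
from datetime import datetime

def overlap_minutes(row_start, row_end, bucket_start, bucket_end):
    """Minutes of [row_start, row_end) that fall inside [bucket_start, bucket_end)."""
    seconds = (min(row_end, bucket_end) - max(row_start, bucket_start)).total_seconds()
    return max(seconds, 0) / 60

def bucket_stats(rows, bucket_start, bucket_end):
    """Return (X, Y, Z) for one bucket, using the pooled formula from the question."""
    # A row matches when it overlaps the bucket at all.
    hits = [(s, e, c) for s, e, c in rows if s < bucket_end and e > bucket_start]
    if not hits:
        return 0, 0.0, 0.0
    used = sum(overlap_minutes(s, e, bucket_start, bucket_end) for s, e, _ in hits)
    total_cost = sum(c for _, _, c in hits)
    total_minutes = sum((e - s).total_seconds() / 60 for s, e, _ in hits)
    return len(hits), total_cost / total_minutes * used, used

# The three sample rows from the question: (start, end, cost).
rows = [
    (datetime(2010, 1, 1, 10, 8), datetime(2010, 1, 1, 13, 56), 135.0),
    (datetime(2010, 1, 1, 11, 1), datetime(2010, 1, 1, 15, 22), 118.0),
    (datetime(2010, 1, 1, 18, 19), datetime(2010, 1, 2, 1, 43), 167.0),
]

first = bucket_stats(rows, datetime(2010, 1, 1, 10, 0), datetime(2010, 1, 1, 10, 15))
second = bucket_stats(rows, datetime(2010, 1, 1, 11, 0), datetime(2010, 1, 1, 11, 15))
print(first)   # X=1, Y=135/228*7, Z=7
print(second)  # X=2, Y=(135+118)/(228+261)*29, Z=29
```

One thing worth noting: pooling the costs and durations like this is not the same as prorating each row separately (135/228*15 + 118/261*14 comes to roughly 15.21 rather than roughly 15.00), so it is worth deciding which of the two the report actually needs.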

....

I believe I've done the math right here, but I need to figure out how to turn this into a PostgreSQL function so that it can be used within a report.

Ideally, I'd like to be able to call the function with an arbitrary duration, e.g. 15 minutes, 30 minutes, or 60 minutes, and have it split the data up based on that.
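
As a reference for what such a function would need to return, here is a rough Python sketch that takes the bucket width as a parameter. The name split_into_buckets is hypothetical, and aligning the first bucket to the top of the hour of the earliest start is an assumption. Buckets that no row overlaps are skipped.

```python
from datetime import datetime, timedelta

def split_into_buckets(rows, width_minutes):
    """Yield (bucket_start, bucket_end, X, Y, Z) for every bucket that overlaps a row."""
    width = timedelta(minutes=width_minutes)
    # Assumption: the first bucket starts at the top of the hour of the earliest start.
    cursor = min(s for s, _, _ in rows).replace(minute=0, second=0, microsecond=0)
    last_end = max(e for _, e, _ in rows)
    while cursor < last_end:
        bucket_end = cursor + width
        hits = [(s, e, c) for s, e, c in rows if s < bucket_end and e > cursor]
        if hits:
            # Z: minutes of the matching rows that fall inside this bucket.
            used = sum((min(e, bucket_end) - max(s, cursor)).total_seconds() / 60
                       for s, e, _ in hits)
            # Pooled rate: total cost of the matching rows per minute of their duration.
            rate = (sum(c for _, _, c in hits)
                    / sum((e - s).total_seconds() / 60 for s, e, _ in hits))
            yield cursor, bucket_end, len(hits), rate * used, used
        cursor = bucket_end

# The three sample rows from the question: (start, end, cost).
rows = [
    (datetime(2010, 1, 1, 10, 8), datetime(2010, 1, 1, 13, 56), 135.0),
    (datetime(2010, 1, 1, 11, 1), datetime(2010, 1, 1, 15, 22), 118.0),
    (datetime(2010, 1, 1, 18, 19), datetime(2010, 1, 2, 1, 43), 167.0),
]

quarter_hours = list(split_into_buckets(rows, 15))
hours = list(split_into_buckets(rows, 60))
for bucket in quarter_hours[:5]:
    print(bucket)
```

Whatever the width, the Z values should add up to the total duration of all rows (228 + 261 + 444 = 933 minutes for the sample data), which makes a handy check for a database version too.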

Any ideas?

+1  A: 

Here is my try. Given this table definition:

CREATE TABLE interval_test
(
  "start" timestamp without time zone,
  "end" timestamp without time zone,
  "cost" integer
)

This query seems to do what you want, though I'm not sure it is the best solution. Note that it requires PostgreSQL 8.4 or later, because it uses window functions and a recursive WITH query.

WITH RECURSIVE intervals(period_start) AS (
    -- first bucket: the hour containing the earliest start
    SELECT date_trunc('hour', MIN("start")) AS period_start
      FROM interval_test
  UNION ALL
    -- keep adding 15-minute buckets while the next bucket starts before the latest end
    SELECT intervals.period_start + INTERVAL '15 MINUTES'
      FROM intervals
     WHERE (intervals.period_start + INTERVAL '15 MINUTES') < (SELECT MAX("end") FROM interval_test)
)
SELECT DISTINCT
    period_start,
    intervals.period_start + INTERVAL '15 MINUTES' AS period_end,
    COUNT(*) OVER (PARTITION BY period_start) AS record_count,
    -- time inside this bucket, summed over all overlapping rows
    SUM(LEAST(period_start + INTERVAL '15 MINUTES', "end")::timestamp - GREATEST(period_start, "start")::timestamp)
        OVER (PARTITION BY period_start) AS total_time,
    -- (total cost / total minutes of the overlapping rows) * minutes inside this bucket
    (SUM(cost) OVER (PARTITION BY period_start)
       / (EXTRACT(EPOCH FROM SUM("end" - "start") OVER (PARTITION BY period_start)) / 60))
     * ((EXTRACT(EPOCH FROM SUM(LEAST(period_start + INTERVAL '15 MINUTES', "end")::timestamp - GREATEST(period_start, "start")::timestamp)
                 OVER (PARTITION BY period_start))) / 60) AS expense
FROM interval_test
INNER JOIN intervals
    ON (intervals.period_start, intervals.period_start + INTERVAL '15 MINUTES')
       OVERLAPS (interval_test."start", interval_test."end")
ORDER BY period_start ASC;
jira
This looks pretty close! Seems to be missing the "cost" / expense element though.
Anthony
I updated the query to include the cost/expense. I'm not sure I understood exactly what that calculation is supposed to do, but you should be able to modify the query for your purpose.
jira