ansaurus

Question

count number of rows that occur for each date in column date range.

Answer 1

+2 A:

Hi Will,

You could use the method described in these SO:

Basically: join with a generated calendar and GROUP BY your subset of columns.

SQL> WITH DATA AS (
  2  SELECT 'A' grp, to_date('01/01/01') start_date, to_date('01/03/01') end_date FROM DUAL
  3  UNION ALL SELECT 'A', to_date('01/01/01'), to_date('01/02/01') FROM DUAL
  4  UNION ALL SELECT 'A', to_date('01/03/01'), to_date('01/04/01') FROM DUAL
  5  UNION ALL SELECT 'B', to_date('01/01/01'), to_date('01/01/01') FROM DUAL
  6  ), calendar AS (
  7  SELECT to_date('01/01/01') + ROWNUM - 1 d
  8    FROM dual
  9    CONNECT BY LEVEL <= to_date('01/04/01') - to_date('01/01/01') + 1
 10  )
 11  SELECT data.grp, calendar.d, COUNT(*) cnt
 12    FROM data
 13    JOIN calendar ON calendar.d BETWEEN data.start_date AND data.end_date
 14   GROUP BY data.grp, calendar.d;

GRP D                  CNT
--- ----------- ----------
A   04/01/2001           1
A   02/01/2001           2
B   01/01/2001           1
A   03/01/2001           2
A   01/01/2001           2

Vincent Malgrat 2009-09-08 13:54:43

Answer 2

+1 A:

Typically I solve this kind of problem with a numbers table:

WITH Dates AS (
    SELECT DateAdd(d, Numbers.Number - 1, '1/1/2001') AS Date
    FROM Numbers
    WHERE Numbers.Number BETWEEN 1 AND 100000 -- Arbitrary date range
)
SELECT GroupData.Group, Dates.Date, COUNT(*)
FROM Dates
LEFT JOIN GroupData
    ON Dates.Date BETWEEN GroupData.StartDate AND GroupData.EndDate
GROUP BY GroupData.Group, Dates.Date
ORDER BY GroupData.Group, Dates.Date

Cade Roux 2009-09-08 13:57:38

Upvoted because it's quick and consice. However, this is SQL Server code, and it would need to be "translated" into PL/SQL.

Philip Kelley 2009-09-08 14:11:51

`@Philip Kelley`: this is perfectly valid `Oracle` code just as well. However, this will not generate count for each group/date: if for a given date there are not records for a certain group, all these groups will be merged together into a single record with a `NULL` as a group and `1` as a count.

Quassnoi 2009-09-08 14:15:40

@Quassnoi: this is not perfectly valid Oracle code. The DateAdd function does not exist and you cannot have a column called GROUP.

Rob van Wijk 2009-09-10 15:18:14

Answer 3

+2 A:

WITH    q AS
        (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    mytable
                ) + level - 1 AS mydate
        FROM    dual
        CONNECT BY
                level <= (
                SELECT  MAX(end_date) - MIN(start_date)
                FROM    mytable
                )
        )
SELECT  group, mydate,
        (
        SELECT  COUNT(*)
        FROM    mytable mi
        WHERE   mi.group = mo.group
                AND q BETWEEN mi.start_date AND mi.end_date
        ) 
FROM    q
CROSS JOIN
        (
        SELECT  DISTINCT group
        FROM    mytable
        ) mo

Update:

A better and faster query making use of analytic functions.

The main idea is that the number of ranges containing each date is the difference before the count of ranges started before that date and the count of ranges that ended before it.

SELECT  cur_date,
        grouper,
        SUM(COALESCE(scnt, 0) - COALESCE(ecnt, 0)) OVER (PARTITION BY grouper ORDER BY cur_date) AS ranges
FROM    (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + level - 1 AS cur_date
        FROM    dual
        CONNECT BY
                level <=
                (
                SELECT  MAX(end_date)
                FROM    t_range
                ) -
                (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + 1
        ) dates
CROSS JOIN
        (
        SELECT  DISTINCT grouper AS grouper
        FROM    t_range
        ) groups
LEFT JOIN
        (
        SELECT  grouper AS sgrp, start_date, COUNT(*) AS scnt
        FROM    t_range
        GROUP BY
                grouper, start_date
        ) starts
ON      sgrp = grouper
        AND start_date = cur_date
LEFT JOIN
        (
        SELECT  grouper AS egrp, end_date, COUNT(*) AS ecnt
        FROM    t_range
        GROUP BY
                grouper, end_date
        ) ends
ON      egrp = grouper
        AND end_date = cur_date - 1
ORDER BY
        grouper, cur_date

This query completes in 1 second on 1,000,000 rows.

See this entry in my blog for more detail:

Oracle: generating a list of dates and counting ranges for each date

Quassnoi 2009-09-08 14:01:22

You beat me to it. Solid approach...

David Andres 2009-09-09 00:21:31

ansaurus

tags:

views:

answers:

count number of rows that occur for each date in column date range.

related questions