tags:

views:

467

answers:

3

Hi, I would appreciate a little expert help please.

in an SQL SELECT statement I am trying to get the last day with data per month for the last year. Example, I am easily able to get the last day of each month and join that to my data table, but the problem is, if the last day of the month does not have data, then there is no returned data. What I need is for the SELECT to return the last day with data for the month.

This is probably easy to do, but to be honest, my brain fart is starting to hurt.

I've attached the select below that works for returning the data for only the last day of the month for the last 12 months.

Thanks in advance for your help!

SELECT fd.cust_id,fd.server_name,fd.instance_name,
    TRUNC(fd.coll_date) AS coll_date,fd.column_name
FROM super_table fd,
    (SELECT TRUNC(daterange,'MM')-1 first_of_month
    FROM (
    select TRUNC(sysdate-365,'MM') + level as DateRange
    from    dual
    connect by level<=365)
    GROUP BY TRUNC(daterange,'MM')) fom
WHERE fd.cust_id = :CUST_ID
AND fd.coll_date > SYSDATE-400
AND TRUNC(fd.coll_date) = fom.first_of_month
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,
    TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
+1  A: 

You probably need to group your data so that each month's data is in the group, and then within the group select the maximum date present. The sub-query might be:

SELECT MAX(coll_date) AS last_day_of_month
    FROM Super_Table AS fd
    GROUP BY YEAR(coll_date) * 100 + MONTH(coll_date);

This presumes that the functions YEAR() and MONTH() exist to extract the year and month from a date as an integer value. Clearly, this doesn't constrain the range of dates - you can do that, too. If you don't have the functions in Oracle, then you do some sort of manipulation to get the equivalent result.

Using information from Rhose (thanks):

SELECT MAX(coll_date) AS last_day_of_month
    FROM Super_Table AS fd
    GROUP BY TO_CHAR(coll_date, 'YYYYMM');

This achieves the same net result, putting all dates from the same calendar month into a group and then determining the maximum value present within that group.

Jonathan Leffler
For Oracle you could use GROUP BY TO_CHAR(coll_date,'YYYYMM')
Rhose
That's not Oracle syntax. Replace the group by clause with GROUP BY TRUNC(coll_date,'MM') and it'll work.
Rob van Wijk
@Rhose: that makes sense to me - would work nicely. @Rob van Wijk: what does that TRUNC do? If it generates just the month portion (as I suspect it does), then data from the same month in different years would be grouped together, leading to the wrong answer.
Jonathan Leffler
@Jonathan: "what does that TRUNC do?" http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/functions209.htm#SQLRF06151. It does not work as you suspect.
Rob van Wijk
Thanks for the x-ref, Rob. There are times when knowing one DBMS does not help with another - and this is one of those times (and dates, in this context:D).
Jonathan Leffler
+1  A: 

Putting the above pieces together, would something like this work for you?

 SELECT fd.cust_id,
       fd.server_name,
       fd.instance_name,
       TRUNC(fd.coll_date) AS coll_date,
       fd.column_name
  FROM super_table fd,
 WHERE fd.cust_id = :CUST_ID
   AND TRUNC(fd.coll_date) IN (
                                SELECT MAX(TRUNC(coll_date))
                                  FROM super_table
                                 WHERE coll_date > SYSDATE - 400
                                   AND cust_id = :CUST_ID
                                 GROUP BY TO_CHAR(coll_date,'YYYYMM')
                              )
 GROUP BY fd.cust_id,fd.server_name,fd.instance_name,TRUNC(fd.coll_date),fd.column_name
 ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
Rhose
+1  A: 

Here's another approach, if ANSI row_number() is supported:

with RevDayRanked(itemDate,rn) as (
  select
    cast(coll_date as date),
    row_number() over (
      partition by datediff(month,coll_date,'2000-01-01') -- rewrite datediff as needed for your platform
      order by coll_date desc
    )
  from super_table
)
  select itemDate
  from RevDayRanked
  where rn = 1;

Rows numbered 1 will be nondeterministically chosen among rows on the last active date of the month, so you don't need distinct. If you want information out of the table for all rows on these dates, use rank() over days instead of row_number() over coll_date values, so a value of 1 appears for any row on the last active date of the month, and select the additional columns you need:

with RevDayRanked(cust_id, server_name, coll_date, rk) as (
  select
    cust_id, server_name, coll_date,
    rank() over (
      partition by datediff(month,coll_date,'2000-01-01')
      order by cast(coll_date as date) desc
    )
  from super_table
)
  select cust_id, server_name, coll_date
  from RevDayRanked
  where rk = 1;

If row_number() and rank() aren't supported, another approach is this (for the second query above). Select all rows from your table for which there's no row in the table from a later day in the same month.

select
  cust_id, server_name, coll_date
from super_table as ST1
where not exists (
  select *
  from super_table as ST2
  where datediff(month,ST1.coll_date,ST2.coll_date) = 0
  and cast(ST2.coll_date as date) > cast(ST1.coll_date as date)
)

If you have to do this kind of thing a lot, see if you can create an index over computed columns that hold cast(coll_date as date) and a month indicator like datediff(month,'2001-01-01',coll_date). That'll make more of the predicates SARGs.

Steve Kass