Hi I am faced with a difficult problem:
I have a table (oracle 9i) of weather forecasts (many 100's of millions of records in size.) whose makeup looks like this:
stationid forecastdate forecastinterval forecastcreated forecastvalue
---------------------------------------------------------------------------------
varchar (pk) datetime (pk) integer (pk) datetime (pk) integer
where:
stationid
refers to one of the many weather stations that may create a forecast;forecastdate
refers to the date the forecast is for (date only not time.)forecastinterval
refers to the hour in theforecastdate
for the forecast (0 - 23).forecastcreated
refers to the time the forecast was made, can be many days beforehand.forecastvalue
refers to the actual value of the forecast (as the name implies.)
I need to determine for a given stationid
and a given forecastdate
and forecastinterval
pair, the records where a forecastvalue
increments more than a nominal number (say 500). I'll show a table of the condition here:
stationid forecastdate forecastinterval forecastcreated forecastvalue
---------------------------------------------------------------------------------
'stationa' 13-dec-09 10 10-dec-09 04:50:10 0
'stationa' 13-dec-09 10 10-dec-09 17:06:13 0
'stationa' 13-dec-09 10 12-dec-09 05:20:50 300
'stationa' 13-dec-09 10 13-dec-09 09:20:50 300
In the above scenario, I'd like to pull out the third record. This is the record where the forecast value increased by a nominal (say 100) amount.
The task is proving to be very difficult due to the sheer size of the table (many 100s of millions of records.) and taking so long to finish (so long in fact that my query has never returned.)
Here is my attempt so far to grab these values:
select
wtr.stationid,
wtr.forecastcreated,
wtr.forecastvalue,
(wtr.forecastdate + wtr.forecastinterval / 24) fcst_date
from
(select inner.*
rank() over (partition by stationid,
(inner.forecastdate + inner.forecastinterval),
inner.forecastcreated
order by stationid,
(inner.forecastdate + inner.forecastinterval) asc,
inner.forecastcreated asc
) rk
from weathertable inner) wtr
where
wtr.forecastvalue - 100 > (
select lastvalue
from (select y.*,
rank() over (partition by stationid,
(forecastdate + forecastinterval),
forecastcreated
order by stationid,
(forecastdate + forecastinterval) asc,
forecastcreated asc) rk
from weathertable y
) z
where z.stationid = wtr.stationid
and z.forecastdate = wtr.forecastdate
and (z.forecastinterval =
wtr.forecastinterval)
/* here is where i try to get the 'previous' forecast value.*/
and wtr.rk = z.rk + 1)