tags:

views:

455

answers:

2

I am having trouble writing a query to select one row per "date", given certain conditions. My table has this structure:

ID  date       expiration    callput    iv    delta
1   1/1/2009   1/20/2009     C          0.4    0.61
2   1/1/2009   1/20/2009     C          0.3    0.51
3   1/1/2009   2/20/2009     C          0.2    0.41

I would like to write a query with the following characteristics:

  • For each row, calculate the "days", i.e. the expiration date minus the date. For instance, for row one, the "days" is 19 (1/20 minus 1/1)
  • The result set should only have rows with a "days" of between 15 and 50
  • The "callput" value must be "C"
  • For each date, show only one row. That row should have the following characteristics:
    • The delta should be greater than 0.5
    • The delta should be the smallest number greater than 0.5
    • If there are two rows, the row with the lower days should be selected

Here is 'days' for the sample data above:

ID  date       expiration    days    callput    iv    delta
1   1/1/2009   1/20/2009     19      C          0.4    0.61
2   1/1/2009   1/20/2009     19      C          0.3    0.51
3   1/1/2009   2/20/2009     50      C          0.2    0.41

For my sample dataset, the answer should be row 2, because row 2's "delta" is above 0.5, row 2's delta of 0.51 is closer to 0.5 than row 1's 0.61, and row 2's "days" of 19 is less than row 3's "days" of 50.

This is the query I've written so far:

SELECT date, Min(delta) AS MaxOfdelta, [expiration]-[date] AS days
FROM RAWDATA
WHERE (((delta)>0.5) AND ((callput)="C") AND (([expiration]-[date])>=15 And ([expiration]-[date])<=50))
GROUP BY date, [expiration]-[date]
ORDER BY date;

This works somewhat, but sometimes, there are multiple rows for one date, because two rows on a given day can have a "days" between 15 and 50. I can't get my query to obey the rule "If there are two rows, the row with the lower days should be selected". I would also like the "iv" value for that row to be present in my query result set.

I happen to be using Microsoft Access, but syntax for any SQL engine would be appreciated! :-)

+3  A: 

What you can do is select the right rows in a subquery. This query should find the rows you're looking for:

select [date], min([expiration]-[date])
from rawdata
where delta > 0.5
and callput = 'C' 
and [expiration]-[date] between 15 and 50
group by [date]

To find the delta that belongs to these rows, put it in a subquery and join on it:

select *
from rawdata
inner join (
    select [date]
    ,      min([expiration]-[date]) as days
    from rawdata
    where delta > 0.5
    and callput = 'C' 
    and [expiration]-[date] between 15 and 50
    group by [date]
) as filter 
on filter.date = rawdata.date
and filter.days = rawdata.[expiration] - rawdata.[date]
where delta > 0.5
and callput = 'C'

To search for the lowest delta within rows with identical "days", you could add another subquery:

select
    SubDaysDelta.date
,   SubDaysDelta.MinDays
,   SubDaysDelta.MinDelta
,   min(rawdata.iv) as MinIv
from rawdata
inner join (
    select 
        SubDays.date
    ,   SubDays.MinDays
    ,   min(delta) as MinDelta
    from rawdata
    inner join (
        select [date]
        ,      min([expiration]-[date]) as MinDays
        from rawdata
        where delta > 0.5
        and callput = 'C' 
        and [expiration]-[date] between 15 and 50
        group by [date]
    ) as SubDays
    on SubDays.date = rawdata.date
    and SubDays.MinDays = rawdata.[expiration] - rawdata.[date]
    where delta > 0.5
    and callput = 'C' 
    group by SubDays.date, SubDays.MinDays
) as SubDaysDelta
on SubDaysDelta.date = rawdata.date
and SubDaysDelta.MinDays = rawdata.[expiration] - rawdata.[date]
and SubDaysDelta.MinDelta = rawdata.delta
where delta > 0.5
and callput = 'C' 
group by SubDaysDelta.date, SubDaysDelta.MinDays, SubDaysDelta.MinDelta

The first subquery "SubDays" searches for rows with the lowest "days". The second subquery "SubDaysDelta" searches for the lowest delta within the "SubDays" set. The outer query filters any duplicates remaining.

It would be more readable and maintainable if you'd use views. The first view could filter on callput and the 15-20 "days" limit. That'd make it a lot easier.

Andomar
Wow, this is almost it! I had to add an "AS" between ) and filter (right after the subquery). However, Access is now choking on this line: "AND filter.days = MIN([expiration]-[date])". It keeps on saying: "JOIN expression not supported". What am I missing? Thanks!
bobbyh
Oops, there shouldn't be a MIN() there. Edited the query.
Andomar
Thanks Andomar! Unfortunately, there are multiple rows in the `rawdata` table with the same number of "days". Is there a way in the first subquery to retrieve the ID column so we can join against that? :-)
bobbyh
You can't select the ID because you're not grouping on it; and grouping on ID would defeat the purpose. The only way I can think of is adding another subquery. Added that to the answer.
Andomar
Wow, Andomar, thanks again! This worked great!There were 3 slight nits I made: 1. There were a few references to "SybDaysDelta" that I changed to "SubDaysDelta" 2. I changed where it said "and SubDays.*days* = rawdata.[expiration] - rawdata.[date]" to "and SubDays.*MinDays* = rawdata.[expiration] - rawdata.[date]", and 3. I changed "between 15 and 20" to "between 15 and 50".Thanks again Andomar!
bobbyh
A: 

VBA!

I wish I could be as thorough, dedicated and helpful a servant as Andomar. I can only up-vote his answer in sheer awe of him.

However ... I would point out there are perhaps compelling reasons to switch to VBA. Even if you are new to VBA, the benefits in control and trouble shooting may put you ahead. And I'd guess any new learning will help elsewhere in your project.

I wish I would provide a complete answer as Andomar did. But give it a whack.

Smandoli