I have a requirement to get txns on a T-5 basis, meaning I need to "go back" 5 business days.

I've coded up two SQL queries for this and the second method is 5 times slower than the first.

How come?

-- Fast
with
BizDays as
( select top 5 bdate
  from  dbo.business_days
  where bdate < '20091211'
  order by bdate desc
)
,BizDate as ( select min(bdate) bdate  from BizDays)
select t.* from txns t
join BizDate on t.bdate <= BizDate.bdate

-- Slow
with
BizDays as
( select dense_rank() Over(order by bdate Desc) RN
     , bdate
  from  dbo.business_days 
  where bdate < '20091211'
)
,BizDate as ( select bdate from BizDays where RN = 5)
select t.* from txns t 
join BizDate on t.bdate <= BizDate.bdate
+3  A: 

DENSE_RANK does not stop after the first 5 records like TOP 5 does.

Though DENSE_RANK is monotonic and hence theoretically could be optimized to TOP WITH TIES, SQL Server's optimizer is not aware of that and does not do this optimization.

If your business days are unique, you can replace DENSE_RANK with ROW_NUMBER and get the same performance, since ROW_NUMBER is optimized to a TOP.
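For reference, the ROW_NUMBER variant of the slow query would look roughly like the sketch below. It assumes bdate is unique in dbo.business_days, in which case ROW_NUMBER and DENSE_RANK number the rows identically. Whether the optimizer actually turns this into a TOP depends on the plan you get, so compare the execution plans rather than taking it on faith.

```sql
-- Sketch: the "Slow" query with ROW_NUMBER() in place of DENSE_RANK().
-- Assumes dbo.business_days.bdate is unique.
with
BizDays as
( select row_number() over (order by bdate desc) RN
       , bdate
  from  dbo.business_days
  where bdate < '20091211'
)
,BizDate as ( select bdate from BizDays where RN = 5 )
select t.* from txns t
join BizDate on t.bdate <= BizDate.bdate;
```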

Quassnoi
My first attempt with the ranking functions was with ROW_NUMBER, and it gave the same perf as DENSE_RANK
Scott Weinstein
A: 

Instead of putting the conditions in the WHERE and JOIN clauses, could you perhaps use ORDER BY on your meeting data and then LIMIT offset, rowcount?
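In T-SQL the closest equivalent of LIMIT offset, rowcount is ORDER BY ... OFFSET ... FETCH, which only exists in SQL Server 2012 and later. A minimal sketch of the idea, assuming that version and assuming bdate is unique (with duplicate dates it would not match the DENSE_RANK query):

```sql
-- Sketch: OFFSET ... FETCH requires SQL Server 2012 or later.
-- Assumes dbo.business_days.bdate is unique.
with
BizDate as
( select bdate
  from  dbo.business_days
  where bdate < '20091211'
  order by bdate desc
  offset 4 rows fetch next 1 rows only  -- skip the 4 most recent days, keep the 5th
)
select t.* from txns t
join BizDate on t.bdate <= BizDate.bdate;
```

The ORDER BY inside the CTE is allowed here only because it is paired with OFFSET/FETCH.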

fsb
This is `SQL Server`, no `LIMIT` there.
Quassnoi
A: 

The reason this is running so slowly is that DENSE_RANK() and ROW_NUMBER() are functions. The engine has to read every record in the table that matches the WHERE clause, apply the function to each row, save the function value, and only then pick out the row ranked 5 from that list.

A "plain" top 5 uses the index on the table to get the first 5 records that meet the WHERE clause. In the best case, the engine may only have to read a couple of index pages. Worst case, it may have to read a few data pages as well. Even without an index, the engine is reading the rows but does not have to execute the function or work with temporary tables.
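One way to check this on your own data is to make sure bdate is indexed and then compare the I/O and timing figures for the two queries. The sketch below uses a made-up index name (IX_business_days_bdate) and assumes bdate is not already covered by an index such as the primary key.

```sql
-- Hypothetical index name; skip this if bdate is already indexed.
create index IX_business_days_bdate on dbo.business_days (bdate);

set statistics io on;    -- report logical/physical reads per table
set statistics time on;  -- report CPU and elapsed time

-- Run the "Fast" and "Slow" queries here and compare the Messages output.

set statistics io off;
set statistics time off;
```

If the explanation above holds, the fast query should show far fewer logical reads on business_days than the slow one.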

Darryl Peterson