views:

530

answers:

2

In SQLSERVER 2005, I'm using table-valued function as a convenient way to perform arbitrary aggregation on subset data from large table (passing date range or such parameters).

I'm using theses inside larger queries as joined computations and I'm wondering if the query plan optimizer work well with them in every condition or if I'm better to unnest such computation in my larger queries.

  1. Does query plan optimizer unnest table-valued functions if it make sense?
  2. If it doesn't, what do you recommend to avoid code duplication that would occur by manually unnesting them?
  3. If it does, how do you identify that from the execution plan?

code sample:

create table dbo.customers (
 [key] uniqueidentifier
 , constraint pk_dbo_customers
  primary key ([key])
)
go

/* assume large amount of data */
create table dbo.point_of_sales (
 [key] uniqueidentifier
 , customer_key uniqueidentifier
 , constraint pk_dbo_point_of_sales
  primary key ([key])
)
go

create table dbo.product_ranges (
 [key] uniqueidentifier
 , constraint pk_dbo_product_ranges
  primary key ([key])
)
go

create table dbo.products (
 [key] uniqueidentifier
 , product_range_key uniqueidentifier
 , release_date datetime
 , constraint pk_dbo_products 
  primary key ([key])
 , constraint fk_dbo_products_product_range_key 
  foreign key (product_range_key) 
  references dbo.product_ranges ([key])
)
go

.

/* assume large amount of data */
create table dbo.sales_history (
 [key] uniqueidentifier
 , product_key uniqueidentifier
 , point_of_sale_key uniqueidentifier
 , accounting_date datetime
 , amount money
 , quantity int
 , constraint pk_dbo_sales_history
  primary key ([key])
 , constraint fk_dbo_sales_history_product_key
  foreign key (product_key)
  references dbo.products ([key])
 , constraint fk_dbo_sales_history_point_of_sale_key
  foreign key (point_of_sale_key)
  references dbo.point_of_sales ([key])
)
go

create function dbo.f_sales_history_..snip.._date_range
(
    @accountingdatelowerbound datetime,
         @accountingdateupperbound datetime
)
returns table as
return (
 select
                  pos.customer_key
     , sh.product_key
     , sum(sh.amount) amount
     , sum(sh.quantity) quantity
 from 
     dbo.point_of_sales pos
     inner join dbo.sales_history sh 
         on sh.point_of_sale_key = pos.[key]
 where
                  sh.accounting_date between 
                      @accountingdatelowerbound and 
                      @accountingdateupperbound
 group by
                  pos.customer_key
                  , sh.product_key
)
go

-- TODO: insert some data

-- this is a table containing a selection of product ranges
declare @selectedproductranges table([key] uniqueidentifier)

-- this is a table containing a selection of customers
declare @selectedcustomers table([key] uniqueidentifier)

declare @low datetime
 , @up datetime

-- TODO: set top query parameters

.

select
         saleshistory.customer_key
         , saleshistory.product_key
         , saleshistory.amount
         , saleshistory.quantity
from
         dbo.products p
         inner join @selectedproductranges productrangeselection 
             on p.product_range_key = productrangeselection.[key]
         inner join @selectedcustomers customerselection on 1 = 1
         inner join 
         dbo.f_sales_history_..snip.._date_range(@low, @up) saleshistory
             on saleshistory.product_key = p.[key]
             and saleshistory.customer_key = customerselection.[key]

I hope the sample makes sense.

Much thanks for your help!

+5  A: 

In this case, it's an "inline table valued function" The optimiser simply expands (unnests) it if it's useful (or view).

If the function is treated as "black box" by the outer query, the quickest way is to compare IO shown in SSMS vs IO in profiler. Profler captures "black box" IO that SSMS does not.

Blog post by Adam Mechanic (his book is in my drawer at work)

gbn
+3  A: 

1) Yes, using your syntax, it does. If you happened to use a UDF that returned a table which had conditional logic in it, it would not, though.

3) The optimizer won't point out what part of your query it's optimizing, because it may see fit to combine chunks of the plan with your function, or to optimize bits away.

Dave Markle