I'm profiling some of our views and queries (SQL Server 2008) to determine their efficiency with regard to CPU usage and Reads. I understand Reads is the number of logical reads, counted in 8KB pages. But I'm having a hard time determining what I should be satisfied with.

For example, when I query one of our views, which in turn joins with another view and has three OUTER APPLYs with table-valued UDFs, I get a Reads value of 321 with a CPU value of 0. My first thought is that I should be happy with this. But how do I evaluate the value of 321? It tells me that 321 × 8,192 = 2,629,632 bytes of data were logically read to satisfy the query (which returned a single row of 30 columns).
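For reference, this is how I'm capturing Reads and CPU outside of Profiler (the view name below is just a stand-in for ours):

    -- Show logical reads and CPU time for a single statement
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT * FROM dbo.SomeView WHERE SomeKey = 42;  -- stand-in for our actual view

    -- The Messages tab then reports something like:
    --   Table 'SomeTable'. Scan count 1, logical reads 321, physical reads 0, ...
    --   SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms.

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;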

How would some of you go about determining if this is good enough, or requires more fine tuning? What criteria would you use?

Also, I'm curious what is included in those 2,629,632 bytes logically read. Does that include all the data contained in the 30 columns of the single row returned?

+2  A: 

321 reads with a CPU value of 0 sounds pretty good, but it all depends.

How often is this query run? Why are table-returning UDFs used instead of just doing joins? What is the context of database use (how many users, number of transactions per second, database size, is it OLTP or data warehousing)?
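On the "how often is it run" question: if you don't already track this, the plan cache keeps cumulative counters, so a quick DMV query can put those 321 reads in context. A minimal sketch (works on SQL Server 2005 and later):

    -- Cached statements ranked by cumulative logical reads
    SELECT TOP 20
        qs.execution_count,
        qs.total_logical_reads,
        qs.total_worker_time AS total_cpu_microseconds,
        st.text AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_logical_reads DESC;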

The extra data reads come from:

  • All the other data in the pages that had to be read to satisfy the execution plan. Note this includes clustered and nonclustered index pages. Examining the execution plan will give you a better idea of exactly what is being read (see the sketch after this list). You'll see references to all sorts of indexes and tables, and whether a seek or a scan was required. Note that a scan means every page in the whole index or table was read, which is why seeks are preferable to scans.

  • All the related data in tables INNER JOINed inside the views, regardless of whether those JOINs are needed to produce correct results for your particular query, since the optimizer doesn't know whether those INNER JOINs will include or exclude rows until it actually performs them.
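To capture the plan itself without clicking through SSMS, STATISTICS XML returns the actual execution plan alongside the results; a minimal sketch (the query is a hypothetical stand-in):

    -- Returns an extra XML column containing the actual execution plan
    SET STATISTICS XML ON;
    SELECT * FROM dbo.SomeView WHERE SomeKey = 42;  -- hypothetical query
    SET STATISTICS XML OFF;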

If you provide the queries and execution plans, as requested, I could probably give you better advice. Since you're using table-valued UDFs, I would also need to see the UDFs themselves, or at least the execution plan of each UDF (which is only possible by tearing out its body and running it outside a function context, or by converting it to a stored procedure).
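To illustrate what I mean by tearing out the function's body: declare the parameters as variables and run the body as a plain batch, which gives the optimizer something it will produce a visible plan for. All names here are hypothetical:

    -- Hypothetical: body of a multi-statement TVF, extracted so its plan can be inspected
    DECLARE @CustomerID int = 42;  -- was the function's parameter

    SELECT o.OrderID, SUM(d.Qty * d.Price) AS OrderTotal
    FROM dbo.Orders AS o
    INNER JOIN dbo.OrderDetails AS d ON d.OrderID = o.OrderID
    WHERE o.CustomerID = @CustomerID
    GROUP BY o.OrderID;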

Emtucifor
+1  A: 

The 2.5MB includes all the data in the 321 pages: the other rows sharing those pages with the rows your query retrieved, as well as the index pages read to find your data. Note that these are logical reads, not physical reads; e.g. a read from a cached page is much 'cheaper', so take CPU and the profiler's other cost indicators into account as well when optimising.

As for how to determine an optimum 'target' for reads:

FWIW, I compare the actual reads against an optimum value, which I think of as the minimum number of pages needed to return your query's data in a 'perfect' world.

e.g. if you calculate roughly 5 rows per page for table x, and your query returns 20 rows, the 'perfect' number of reads would be 4 (20 rows / 5 rows per page), plus some overhead for navigating indexes (assuming, of course, that the rows are clustered 'perfectly' for your query) - so utopia would be around, say, 5-10 pages.
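Rather than guessing rows per page, you can read the measured density from the physical-stats DMV; a sketch, with a hypothetical table name:

    -- Measured row size and approximate rows per page for a table's clustered index
    SELECT
        avg_record_size_in_bytes,
        8096 / avg_record_size_in_bytes AS approx_rows_per_page,  -- ~8096 usable bytes per 8KB page
        page_count
    FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.TableX'), 1, NULL, 'SAMPLED');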

For a performance-critical query, you can use the actual reads vs the 'utopian' reads to micro-optimise, e.g.:

  • Whether you can fit more rows per page in the cluster (table), e.g. by storing non-searched strings as varchar rather than char, using varchar rather than nvarchar, or using smaller integer types, etc.
  • Whether the clustered index could be changed such that fewer pages would need to be fetched (e.g. if the 20 rows for the above query were scattered across different pages, then reads would be > 4)
  • Failing that (since you can have only one clustered index), whether covering indexes could remove the need to go to the table data (cluster) at all, since a covering index that fits your query will have a higher 'row' density (see the sketch after this list)
  • And for indexes, density improvements such as tuned fill factors or narrower index keys can mean fewer index reads
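For the covering-index point above, a minimal sketch with hypothetical names: the key column satisfies the WHERE clause and the INCLUDEd columns satisfy the SELECT list, so the query never touches the cluster at all, and the fillfactor trade-off is set at creation time:

    -- Hypothetical covering index: seek on SearchCol, all selected columns in the leaf
    CREATE NONCLUSTERED INDEX IX_TableX_SearchCol_Covering
        ON dbo.TableX (SearchCol)
        INCLUDE (Col1, Col2, Col3)
        WITH (FILLFACTOR = 90);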

You might find this article useful

HTH!

nonnb