I have a simple data-logging table with two data columns, A and B, for which I want to generate the sum A+B. However, either or both of A and B could be NULL. The table also has a timestamp column, which is the primary key.

I can see several ways to skin this cat, but I was curious what the preferred method might be to sum the data over a range of timestamps:

1. SUM(COALESCE(A, 0) + COALESCE(B, 0))

2. SUM(COALESCE(A+B, A, B, 0))

3. COALESCE(SUM(A), 0) + COALESCE(SUM(B), 0)
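As a sanity check that the three forms really are interchangeable on NULL-laden data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (SQLite's COALESCE and NULL-propagation rules match SQL Server's for this case; the table and column names simply mirror the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ts INTEGER PRIMARY KEY, A REAL, B REAL)")
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", [
    (1, 10.0, 5.0),   # both present
    (2, None, 5.0),   # A is NULL
    (3, 10.0, None),  # B is NULL
    (4, None, None),  # both NULL
])

forms = {
    1: "SELECT SUM(COALESCE(A,0) + COALESCE(B,0)) FROM log",
    2: "SELECT SUM(COALESCE(A+B, A, B, 0)) FROM log",
    3: "SELECT COALESCE(SUM(A),0) + COALESCE(SUM(B),0) FROM log",
}
for n, sql in forms.items():
    # each form yields 30.0 on this data
    print(n, conn.execute(sql).fetchone()[0])
```

All three agree on the result; they differ only in where the NULL handling happens (per row before aggregation, or once after it), which is what shows up as different shapes in the execution plans.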

I get the following cost percentages from the execution plan:

Form  Select  Compute Scalar  Stream Aggregate  Compute Scalar  Clustered Index Seek
  1     0%         0%               11%              1%                87%
  2     0%         0%               11%              1%                87%

Form  Select  Compute Scalar  Compute Scalar  Stream Aggregate  Clustered Index Seek
  3     0%         0%               0%              12%                88%

I have never delved into query plans much before, so I am not sure how to interpret the numbers. Would it be fair to say that, in general, the more the percentages are skewed toward the right-hand side of the plan, the better? That would make Form 3 superior to Forms 1 and 2.

Thanks

+2  A: 

Optimize for the expected case.

Here, if both A and B have values most of the time, then #2 should do better, because the COALESCE will hit that first value and stop. If A or B is NULL very often, you might do a little better with #1. The key here, as always, is to try both and profile actual times on actual data.
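The reason #2 needs the full fallback chain at all is that SQL arithmetic propagates NULL: A+B is NULL whenever either operand is. A quick sketch of that behavior (again using sqlite3, whose NULL semantics match here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic with NULL yields NULL, so A+B alone would lose rows
# where only one of A or B is missing.
print(conn.execute("SELECT 10 + NULL").fetchone()[0])  # None

# COALESCE walks its arguments left to right and returns the first
# non-NULL one, so the A, B, 0 fallbacks rescue those rows.
print(conn.execute("SELECT COALESCE(10 + NULL, 10, NULL, 0)").fetchone()[0])  # 10
```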

Joel Coehoorn
OK, I can understand that, but what do my query plan results tell me, given that they were run over the same data?
Peter M
Those are _estimated_ plans, generated by the query optimizer _before_ your query actually runs. They tell you how SQL Server intends to execute your query, but they don't take actual times into account. You need to look at the actual execution times. Do it on an otherwise quiet server, and do it several times to get an average.
Joel Coehoorn
@Joel .. so basically you are telling me that looking at this execution plan is not very useful and the only really meaningful determination can be obtained by running the query against real world data on a quiet server? This does seem a bit disingenuous to me.
Peter M
No. Execution plans can help you spot obvious bottlenecks, such as table scans. And when two queries produce different plans, the plans can help you choose between them. But when two queries generate similar plans, you need to profile actual times.
Joel Coehoorn
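On SQL Server itself, the actual times Joel mentions can be captured with SET STATISTICS TIME ON before running each candidate query. As a language-neutral sketch of the "run it several times and average" advice, here is a small timing harness using sqlite3 with made-up data (table name, row counts, and NULL pattern are all illustrative assumptions):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ts INTEGER PRIMARY KEY, A REAL, B REAL)")
# Synthetic data: every third row has a NULL in column B.
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [(i, i * 1.0, None if i % 3 == 0 else 2.0) for i in range(10000)],
)

def avg_time(sql, runs=5):
    """Average wall-clock seconds over several runs, per the advice above."""
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql).fetchone()
    return (time.perf_counter() - start) / runs

print(avg_time("SELECT SUM(COALESCE(A,0) + COALESCE(B,0)) FROM log"))
print(avg_time("SELECT COALESCE(SUM(A),0) + COALESCE(SUM(B),0) FROM log"))
```

On a quiet machine, differences between the forms at this scale will be small; the point is the method, not these particular numbers.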