views:

47

answers:

1

Today I found this query in our code that pulls up a list of errors from our database:

SELECT *
FROM   (
        SELECT Substring(title, 9, Patindex('%)%', title) - 9) AS SERVICE, *
        FROM   core.LOG
        WHERE  logtypeid = 1
               AND title LIKE 'Error: (%'
               AND lastmodified > '2010-06-21T00:00:00'
               AND lastmodified < '2010-06-22T00:00:00'
        ) serviceerrors
WHERE SERVICE = 'CheckHelpDeskEmail'  

It fails with the following error:

Invalid length parameter passed to the SUBSTRING function.

If I remove the WHERE clause on the very last line it works fine. Alternatively, if I take the WHERE clause from the inner subquery and move it to the main query it works fine. So this works:

SELECT *
FROM   (
        SELECT Substring(title, 9, Patindex('%)%', title) - 9) AS SERVICE, *
        FROM   core.LOG
        ) serviceerrors
WHERE  logtypeid = 1
AND title LIKE 'Error: (%'
AND lastmodified > '2010-06-21T00:00:00'
AND lastmodified < '2010-06-22T00:00:00'
AND SERVICE = 'CheckHelpDeskEmail' 

Does anyone know why? I think the problem is that SQL is actually making multiple runs through the core.LOG table and the first time it simply runs the Substring line on the whole table which fails since some lines will not have a ')' in them. However, if it runs the remaining lines in the WHERE clause it won't find any rows missing a ')'. Then it makes another run through the core.LOG table and runs the remaining filters on it. This seems unefficient since it will end up running the Substring function on millions of rows, vs about 15 rows from the remaining filters in the WHERE clause.

Here's my XML Execution plan

    <?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0" Build="9.00.3310.00" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"&gt;
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="1" StatementId="1" StatementOptmLevel="FULL" StatementSubTreeCost="36.4574" StatementText="SELECT *&#xD;&#xA;FROM   (&#xD;&#xA;       SELECT Substring(title, 9, Patindex('%)%', title) - 9) AS SERVICE,&#xD;&#xA;               *&#xD;&#xA;        FROM   core.LOG&#xD;&#xA;        WHERE  logtypeid = 1&#xD;&#xA;               AND title LIKE 'Error: (%'&#xD;&#xA;               AND lastmodified &gt; '2010-06-21T00:00:00'&#xD;&#xA;               AND lastmodified &lt; '2010-06-22T00:00:00'&#xD;&#xA;        ) serviceerrors&#xD;&#xA;WHERE SERVICE = 'CheckHelpDeskEmail'  &#xD;&#xA;" StatementType="SELECT">
          <StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
          <QueryPlan CachedPlanSize="47" CompileTime="10" CompileCPU="7" CompileMemory="544">
            <RelOp AvgRowSize="4544" EstimateCPU="1E-07" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="36.4574">
              <OutputList>
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                <ColumnReference Column="Expr1003" />
              </OutputList>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1003" />
                    <ScalarOperator ScalarString="substring([Aqueduct].[Core].[Log].[Title],(9),patindex(N'%)%',[Aqueduct].[Core].[Log].[Title])-(9))">
                      <Intrinsic FunctionName="substring">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(9)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Arithmetic Operation="SUB">
                            <ScalarOperator>
                              <Intrinsic FunctionName="patindex">
                                <ScalarOperator>
                                  <Const ConstValue="N'%)%'" />
                                </ScalarOperator>
                                <ScalarOperator>
                                  <Identifier>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                  </Identifier>
                                </ScalarOperator>
                              </Intrinsic>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(9)" />
                            </ScalarOperator>
                          </Arithmetic>
                        </ScalarOperator>
                      </Intrinsic>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="4342" EstimateCPU="0.0288036" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Gather Streams" NodeId="1" Parallel="true" PhysicalOp="Parallelism" EstimatedTotalSubtreeCost="36.4574">
                  <OutputList>
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                  </OutputList>
                  <Parallelism>
                    <RelOp AvgRowSize="4342" EstimateCPU="0.0128582" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Filter" NodeId="2" Parallel="true" PhysicalOp="Filter" EstimatedTotalSubtreeCost="36.4286">
                      <OutputList>
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                      </OutputList>
                      <Filter StartupExpression="false">
                        <RelOp AvgRowSize="4342" EstimateCPU="0.0235734" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="11279.1" LogicalOp="Inner Join" NodeId="3" Parallel="true" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="36.4157">
                          <OutputList>
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                          </OutputList>
                          <NestedLoops Optimized="true" WithUnorderedPrefetch="true">
                            <OuterReferences>
                              <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                              <ColumnReference Column="Expr1004" />
                            </OuterReferences>
                            <RelOp AvgRowSize="19" EstimateCPU="0.00628203" EstimateIO="0.0186806" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="11279.1" LogicalOp="Index Seek" NodeId="6" Parallel="true" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="0.0249626">
                              <OutputList>
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                              </OutputList>
                              <IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" NoExpandHint="false">
                                <DefinedValues>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                                  </DefinedValue>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                                  </DefinedValue>
                                </DefinedValues>
                                <Object Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Index="[IX_LastModified]" />
                                <SeekPredicates>
                                  <SeekPredicate>
                                    <StartRange ScanType="GT">
                                      <RangeColumns>
                                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                                      </RangeColumns>
                                      <RangeExpressions>
                                        <ScalarOperator ScalarString="'2010-06-21 00:00:00.000'">
                                          <Const ConstValue="'2010-06-21 00:00:00.000'" />
                                        </ScalarOperator>
                                      </RangeExpressions>
                                    </StartRange>
                                    <EndRange ScanType="LT">
                                      <RangeColumns>
                                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModified" />
                                      </RangeColumns>
                                      <RangeExpressions>
                                        <ScalarOperator ScalarString="'2010-06-22 00:00:00.000'">
                                          <Const ConstValue="'2010-06-22 00:00:00.000'" />
                                        </ScalarOperator>
                                      </RangeExpressions>
                                    </EndRange>
                                  </SeekPredicate>
                                </SeekPredicates>
                              </IndexScan>
                            </RelOp>
                            <RelOp AvgRowSize="4447" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="11278.1" EstimateRewinds="0" EstimateRows="1" LogicalOp="Clustered Index Seek" NodeId="8" Parallel="true" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="36.3672">
                              <OutputList>
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                                <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                              </OutputList>
                              <IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" NoExpandHint="false">
                                <DefinedValues>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                  </DefinedValue>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Details" />
                                  </DefinedValue>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="IdentifierHash" />
                                  </DefinedValue>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LastModifier" />
                                  </DefinedValue>
                                  <DefinedValue>
                                    <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                                  </DefinedValue>
                                </DefinedValues>
                                <Object Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Index="[PK_Core_Log]" TableReferenceId="-1" />
                                <SeekPredicates>
                                  <SeekPredicate>
                                    <Prefix ScanType="EQ">
                                      <RangeColumns>
                                        <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                                      </RangeColumns>
                                      <RangeExpressions>
                                        <ScalarOperator ScalarString="[Aqueduct].[Core].[Log].[LogId]">
                                          <Identifier>
                                            <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogId" />
                                          </Identifier>
                                        </ScalarOperator>
                                      </RangeExpressions>
                                    </Prefix>
                                  </SeekPredicate>
                                </SeekPredicates>
                              </IndexScan>
                            </RelOp>
                          </NestedLoops>
                        </RelOp>
                        <Predicate>
                          <ScalarOperator ScalarString="substring([Aqueduct].[Core].[Log].[Title],(9),patindex(N'%)%',[Aqueduct].[Core].[Log].[Title])-(9))=N'CheckHelpDeskEmail' AND [Aqueduct].[Core].[Log].[Title] like N'Error: (%' AND [Aqueduct].[Core].[Log].[LogTypeId]=(1)">
                            <Logical Operation="AND">
                              <ScalarOperator>
                                <Compare CompareOp="EQ">
                                  <ScalarOperator>
                                    <Intrinsic FunctionName="substring">
                                      <ScalarOperator>
                                        <Identifier>
                                          <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                        </Identifier>
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Const ConstValue="(9)" />
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Arithmetic Operation="SUB">
                                          <ScalarOperator>
                                            <Intrinsic FunctionName="patindex">
                                              <ScalarOperator>
                                                <Const ConstValue="N'%)%'" />
                                              </ScalarOperator>
                                              <ScalarOperator>
                                                <Identifier>
                                                  <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                                </Identifier>
                                              </ScalarOperator>
                                            </Intrinsic>
                                          </ScalarOperator>
                                          <ScalarOperator>
                                            <Const ConstValue="(9)" />
                                          </ScalarOperator>
                                        </Arithmetic>
                                      </ScalarOperator>
                                    </Intrinsic>
                                  </ScalarOperator>
                                  <ScalarOperator>
                                    <Const ConstValue="N'CheckHelpDeskEmail'" />
                                  </ScalarOperator>
                                </Compare>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Intrinsic FunctionName="like">
                                  <ScalarOperator>
                                    <Identifier>
                                      <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="Title" />
                                    </Identifier>
                                  </ScalarOperator>
                                  <ScalarOperator>
                                    <Const ConstValue="N'Error: (%'" />
                                  </ScalarOperator>
                                </Intrinsic>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Compare CompareOp="EQ">
                                  <ScalarOperator>
                                    <Identifier>
                                      <ColumnReference Database="[Aqueduct]" Schema="[Core]" Table="[Log]" Column="LogTypeId" />
                                    </Identifier>
                                  </ScalarOperator>
                                  <ScalarOperator>
                                    <Const ConstValue="(1)" />
                                  </ScalarOperator>
                                </Compare>
                              </ScalarOperator>
                            </Logical>
                          </ScalarOperator>
                        </Predicate>
                      </Filter>
                    </RelOp>
                  </Parallelism>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>
+2  A: 

Your code is making invalid assumptions. In a declarative set oriented language like SQL the execution is free to choose whatever execution plan it sees fit. What you see as inefficient is most likely a valid optimization where the title is projected first from an index that satisfies the predicates on lastmodified or something similar. You cannot make any assumption about order of execution, and you are therefore not allowed to have in the projection list expressions like SUBSTRING (..,9,..) which will bomb on certain rows.

Another example of problems that arise from a similar invalid assumption are SQL Server boolean operator short-circuit evaluation bugs.

Remus Rusanu
Thanks, it makes sense that I can't assume any execution order.
kd7iwp