views:

49

answers:

1

Ok this general problem has reared its ugly head twice in the past 6 months (different stored procedure). We have our internal users report a timeout error in the application. We can reproduce the problem in the application in a controlled environment. So we go through the normal steps of checking for blocking using sp_who2. Everything looks good with no blocking. So we do a sql trace to see exactly how the procedure is being executed. We copy that to a new window in SQL Management Studio and execute what sql trace was telling us ADO.Net was doing it and it finished in milliseconds. Our applications timeout is 30 seconds. When this problem happened a couple months back we had SQL Server 2005. We now have upgraded to SQL Server 2008 R2. Whats the next step to diagnose a problem like this?

@Martin: Thanks for the response. I will read your post in detail and let you know what i found out. Until then here is the sql in the SP you requested:

Select 
    @Exists=0, 
    @EarnRecId=0,
    @SuppStatusId = 0,
    @SLRecId = 0,
    @EarnRecDS = Null

Select
    @EarnRecId = er.EarningsId,
    @EarnRecDS = Convert(Varchar(26),er.Datestamp, 109),
    @SuppStatusId = s.SuppStatusId,
    @SLRecId = s.SLId
From
    Tracking tr
    Inner Join Supps s On s.SuppId = tr.SuppId
    Inner Join Earnings er On er.EarnRecId = s.SuppId
Where
    tr.ClaimId = @ClaimId
    and er.FiscalYr = @FiscalYr
    And er.EmplyrId In (@EmpId1,@EmpId2)

If @EarnRecId > 0
    Begin
        Set @Exists=1
    End
+2  A: 

Probably parameter sniffing.

When the stored procedure is invoked and there is no existing execution plan in the cache matching the set options for the connection a new execution plan will be compiled using the parameter values passed in on that invocation.

Sometimes this will happen when the parameters passed are atypical (e.g. have unusually high selectivity) so the generated plan will not be suitable for most other invocations with different parameters.

SSMS has a different default value for the option SET ARITH_ABORT so will not get handed the same problematic plan when you execute the stored procedure inside SSMS.

Next time it happens possibly the simplest way of investigating the issue would be to have 2 separate SSMS windows with the "Include Actual Execution Plan" option enabled and in one do

SET ARITHABORT OFF
EXEC YourProc ...

And in the other

SET ARITHABORT ON
EXEC YourProc ...

Assuming default ADO.NET and SSMS connection options the first one should use the bad plan from the cache.

If that doesn't work for you you could use profiler to see what other set options you need to fiddle with to get the bad plan or just use profiler to get the execution plans directly - or you can retrieve them from the DMVs as below.

select p.query_plan, *
from sys.dm_exec_requests r
cross apply sys.dm_exec_query_plan(r.plan_handle) p
where r.session_id = <spid of your ADO.NET connection>

You might find that the problematic plan is doing tens of thousands of individual index seeks for example whereas the good plan avoids this.

Martin Smith
anyways the solution is to put "with recompile" on the SP which i absolutely hate because its more of a hack then a solution. I don't like to put anything like that or join hints into procedures. Oh btw, its also interesting to note that after i increased the timeout and let both run and if i try to try to run them both again...both are fast until i do a "dbcc freeproccache"
You could use `OPTION (OPTIMIZE FOR (@FiscalYr UNKNOWN))` can you get the actual execution plan for the slow running invocation? If so there should be something obvious there.
Martin Smith
...By which I mean that you may find it is doing a lot more work than you are expecting and that there is significantly more than one row being returned by the join. Feel free to post the actual execution XML plan in your question.
Martin Smith