I have a SQL statement that is hardcoded in an existing VB6 app. I'm writing a new version in C# using LinqToSql. I was able to get LinqToSql to generate the same SQL (before I start refactoring), but for some reason the SQL generated by LinqToSql is 5x slower than the original. These timings come from running the generated SQL directly in LinqPad.

The only real difference my meager SQL eyes can spot is the WITH (NOLOCK) hint, and adding it to the LinqToSql-generated SQL makes no difference.

Can someone point out what I'm doing wrong here? Thanks!

Existing Hard Coded Sql (5.0 Seconds)

SELECT DISTINCT 
CH.ClaimNum, CH.AcnProvID, CH.AcnPatID, CH.TinNum, CH.Diag1, CH.GroupNum, CH.AllowedTotal  
FROM Claims.dbo.T_ClaimsHeader AS CH WITH (NOLOCK) 
WHERE 
CH.ContractID IN ('123A','123B','123C','123D','123E','123F','123G','123H') 
AND ( ( (CH.Transmited Is Null or CH.Transmited = '') 
AND CH.DateTransmit Is Null 
AND CH.EobDate Is Null 
AND CH.ProcessFlag IN ('Y','E') 
AND CH.DataSource NOT IN ('A','EC','EU') 
AND CH.AllowedTotal > 0 ) ) 
ORDER BY CH.AcnPatID, CH.ClaimNum

Generated Sql from LinqToSql (27.6 Seconds)

-- Region Parameters
DECLARE @p0 NVarChar(4) SET @p0 = '123A'
DECLARE @p1 NVarChar(4) SET @p1 = '123B'
DECLARE @p2 NVarChar(4) SET @p2 = '123C'
DECLARE @p3 NVarChar(4) SET @p3 = '123D'
DECLARE @p4 NVarChar(4) SET @p4 = '123E'
DECLARE @p5 NVarChar(4) SET @p5 = '123F'
DECLARE @p6 NVarChar(4) SET @p6 = '123G'
DECLARE @p7 NVarChar(4) SET @p7 = '123H'
DECLARE @p8 VarChar(1) SET @p8 = ''
DECLARE @p9 NVarChar(1) SET @p9 = 'Y'
DECLARE @p10 NVarChar(1) SET @p10 = 'E'
DECLARE @p11 NVarChar(1) SET @p11 = 'A'
DECLARE @p12 NVarChar(2) SET @p12 = 'EC'
DECLARE @p13 NVarChar(2) SET @p13 = 'EU'
DECLARE @p14 Decimal(5,4) SET @p14 = 0
-- EndRegion
SELECT DISTINCT 
[t0].[ClaimNum], 
[t0].[acnprovid] AS [AcnProvID], 
[t0].[acnpatid] AS [AcnPatID], 
[t0].[tinnum] AS [TinNum], 
[t0].[diag1] AS [Diag1], 
[t0].[GroupNum], 
[t0].[allowedtotal] AS [AllowedTotal]
FROM [Claims].[dbo].[T_ClaimsHeader] AS [t0]
WHERE 
([t0].[contractid] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7)) 
AND (([t0].[Transmited] IS NULL) OR ([t0].[Transmited] = @p8)) 
AND ([t0].[DATETRANSMIT] IS NULL) 
AND ([t0].[EOBDATE] IS NULL) 
AND ([t0].[PROCESSFLAG] IN (@p9, @p10)) 
AND (NOT ([t0].[DataSource] IN (@p11, @p12, @p13))) 
AND ([t0].[allowedtotal] > @p14)
ORDER BY [t0].[acnpatid], [t0].[ClaimNum]

New LinqToSql Code (30+ seconds... Times out )

var contractIds = T_ContractDatas.Where(x => x.EdiSubmissionGroupID == "123-01").Select(x => x.CONTRACTID).ToList();
var processFlags = new List<string> {"Y","E"};
var dataSource = new List<string> {"A","EC","EU"};

var results = (from claims in T_ClaimsHeaders
where contractIds.Contains(claims.contractid)
&& (claims.Transmited == null || claims.Transmited == string.Empty )
&& claims.DATETRANSMIT == null
&& claims.EOBDATE == null
&& processFlags.Contains(claims.PROCESSFLAG)
&& !dataSource.Contains(claims.DataSource)
&& claims.allowedtotal > 0

select new
 {
     ClaimNum = claims.ClaimNum,
     AcnProvID = claims.acnprovid,
     AcnPatID = claims.acnpatid,
     TinNum = claims.tinnum,
     Diag1 = claims.diag1,
     GroupNum = claims.GroupNum,
     AllowedTotal = claims.allowedtotal
 }).OrderBy(x => x.ClaimNum).OrderBy(x => x.AcnPatID).Distinct();

I'm using the lists of constants above to make LinqToSql generate IN ('xxx','xxx', etc.). Otherwise it uses subqueries, which are just as slow...

+4  A: 

Compare the execution plans for the two queries. The LinqToSql query uses lots of parameters, so the query optimiser builds an execution plan based on what MIGHT be in the parameters. The hard-coded SQL has literal values, so the optimiser builds a plan based on the actual values, and it is probably producing a much more efficient plan for the literals. Your best bet is to find the slow parts of the execution plan and try to get LinqToSql to produce a better query. If you can't, but you think you can write a better query by hand, create a stored procedure, which you can then expose as a method on your data context class in LinqToSql.
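
One way to make that comparison concrete is to run both forms in the same session with I/O and timing statistics switched on, and with the actual execution plan enabled in the client. The sketch below is only an illustration: it abbreviates the question's query to two predicates, and it assumes the parameterized command reaches the server as an sp_executesql call, which is how the .NET provider normally sends parameterized text commands. Note that a script using DECLARE and SET, as in the question, treats the values as local variables, which the optimiser may estimate differently again.

```sql
-- Sketch only: abbreviated versions of the two queries from the question.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Literal form: the optimiser sees the actual values at compile time.
SELECT DISTINCT CH.ClaimNum, CH.AcnPatID
FROM Claims.dbo.T_ClaimsHeader AS CH
WHERE CH.ProcessFlag IN ('Y', 'E')
  AND CH.AllowedTotal > 0;

-- Parameterized form: assumed to match what the provider sends.
EXEC sp_executesql
    N'SELECT DISTINCT [t0].[ClaimNum], [t0].[acnpatid] AS [AcnPatID]
      FROM [Claims].[dbo].[T_ClaimsHeader] AS [t0]
      WHERE [t0].[PROCESSFLAG] IN (@p9, @p10)
        AND [t0].[allowedtotal] > @p14',
    N'@p9 nvarchar(1), @p10 nvarchar(1), @p14 decimal(5,4)',
    @p9 = N'Y', @p10 = N'E', @p14 = 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads reported in the Messages output, together with the two plans, should show which operator (for example a scan where the literal version gets a seek) accounts for the difference.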

Ben Robinson
+2  A: 

The hard-coded values in the first SQL may be allowing the query optimizer to use indexes that it doesn't know it can efficiently use for the second, parameterised, SQL.

Another possibility is that if you're running the hand-crafted SQL in SQL Server Management Studio, the different default SET-tings of SSMS compared to the .NET SQL Server provider may be affecting performance. If this is the case, changing some of the SET-tings on the .NET connection prior to executing the command might help (e.g. SET ARITHABORT ON) but I don't know if you can do this in LinqPad. See here for more info on this possibility.
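
To check whether the session settings actually differ, something like the following can be run from each client. This is a sketch that assumes SQL Server 2005 or later and permission to query the session DMV; SSMS turns ARITHABORT on by default, while the .NET provider leaves it off.

```sql
-- List the SET options in effect for the current connection.
DBCC USEROPTIONS;

-- The same information for this session, from the DMV.
SELECT session_id, program_name, arithabort, ansi_nulls, quoted_identifier
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;

-- Match the SSMS default, then re-run the slow query on this connection.
SET ARITHABORT ON;
```

If the timings converge once the options match, the difference comes from separately cached plans for each set of options rather than from the query text itself.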

Daniel Renshaw
Thanks for your help, +1
Jason More
+1  A: 

The big difference is the parameters.

I can't know for sure without analyzing the plans, but L2S parameterizes queries so that their plans can be effectively reused, avoiding excessive query recompilation on the server. This is, in general, a Good Thing because it keeps the CPU time low on the SQL Server -- it doesn't have to keep generating and generating and generating the same plan.

But L2S goes a bit overboard when you use constants. It parameterizes them, too, which can be detrimental for performance in certain situations.

Putting on my Aluminum-Foil Clairvoyancy Hat, I'm visualizing the kinds of index structures you might have on that table. For example, you may have an index just on ProcessFlag, and there may be very few rows where ProcessFlag is "Y" or "E", so the query with the hard-coded constants only has to scan the rows where ProcessFlag is "Y" or "E". For the parameterized query, SQL Server generates a plan that is judged to be optimal for arbitrary input. That means the server can't take advantage of the little hint (the constants) that you give it.

My advice to you at this point is to take a good look at your indexes and favor composite indexes which cover more of your WHERE conditions together. I will bet that with a bit of that type of analysis, you will find that the query performance becomes far more similar. (and probably improves, in both cases!)
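
As a rough illustration of what such a composite index could look like, here is a sketch. The index name, the choice and order of key columns, and the INCLUDE list are all assumptions made for the example, since the table's existing indexes and data distribution are unknown. It also assumes SQL Server 2005 or later for INCLUDE.

```sql
-- Hypothetical index: name, key order, and included columns are assumptions.
-- Key columns cover the two IN predicates; the included columns let the
-- remaining filters and the SELECT list be served from the index alone.
CREATE NONCLUSTERED INDEX IX_T_ClaimsHeader_ContractID_ProcessFlag
ON Claims.dbo.T_ClaimsHeader (ContractID, ProcessFlag)
INCLUDE (Transmited, DateTransmit, EobDate, DataSource, AllowedTotal,
         ClaimNum, AcnProvID, AcnPatID, TinNum, Diag1, GroupNum);
```

A wide index like this adds cost to every write on the table, so it is worth checking both execution plans again afterwards to confirm it is actually used.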

Dave Markle
Thanks for your help, +1
Jason More
A: 

You might also check out compiled LINQ queries - http://www.jdconley.com/blog/archive/2007/11/28/linq-to-sql-surprise-performance-hit.aspx

Saurabh Kumar