views:

100

answers:

3

I have a LINQ query, which for some reason is generating an extra/duplicate INNER JOIN. This is causing the query to not return the expected output. If I manually comment that extra JOIN from the generated SQL, then I get seemingly correct output.

Can you detect what I might have done in this LINQ to have caused this extra JOIN?

Thanks.

Here is my approx LINQ

predicate=predicate.And(condition1);
predicate1=predicate1.And(condition2);
predicate1=predicate1.And(condition3);
predicate2=predicate2.Or(predicate1);
predicate=predicate.And(predicate2);
var ids = context.Code.Where(predicate); 
            var rs = from r in ids 
                     group r by r.PersonID into g 
                     let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count() 
                     where matchcount ==2 
                     select new 
                  { 
                      personid = g.Key 
                  };

and here is the generated SQL (the duplicate join is [t7])

Declare @p1 VarChar(10)='Home'
Declare @p2 VarChar(10)='111'
Declare @p3 VarChar(10)='Office'
Declare @p4 VarChar(10)='222'
Declare @p5 int=2

SELECT [t9].[PersonID] AS [pid]
FROM (
    SELECT [t3].[PersonID], (
        SELECT COUNT(*)
        FROM (
            SELECT DISTINCT [t7].[PhoneValue]
            FROM [dbo].[Person] AS [t4]
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID]
            INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType]
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID]
            WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4)))
            ) AS [t8]
        ) AS [value]
    FROM (
        SELECT [t0].[PersonID]
        FROM [dbo].[Person] AS [t0]
        INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]
        INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]
        WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))
        GROUP BY [t0].[PersonID]
        ) AS [t3]
    ) AS [t9]
WHERE [t9].[value] = @p5
A: 

My gut feeling is that the .DISTINCT().COUNT() is treated separately by the linq to sql translation.

I'd also wager that the execution plan on SQL just threw out the dupe.

DigDoug
A: 

Try to rewrite with explicit condition instead of thah abstract "predicate" construction. From what I see in SQL that composition might look weird to a parser in isolation and one join [t5] which you just called dupe :-) is there to serve that condition.

Also, try to tell us what tit you really want to find with that query and try to write normal SQL that does what you wanted. I'm supposed to be human :-) and it look weird to me as well :-))

Technically speaking, you forced double joint by using a condition on in in two separate queries (every var assignment it technically separate query).

Also doing group by a column without doing any aggregation is not alway equivalent to select distinct. In particular select distinct on a join is allowed to take precedence over a join - queries are declatative (can undergo reorderings) and you were trying to force it to be procedural. So LINQ gave you exact procedural :-) and then SQL reordered according to SQL rules :-))

So, just write normal SQL first, and if you can't LINQ-ize it put it into sproc - it's going to make it faster anyway :-)

ZXX
A: 

They aren't being duplicated. You are asking for two different values from the data source.

let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count()

is causing

        SELECT COUNT(*) 
        FROM ( 
            SELECT DISTINCT [t7].[PhoneValue] 
            FROM [dbo].[Person] AS [t4] 
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID] 
            INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType] 
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID] 
            WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4))) 
            ) AS [t8] 

and

                     from r in ids   
                     group r by r.PersonID into g 

is causing

    SELECT [t0].[PersonID]    
    FROM [dbo].[Person] AS [t0]    
    INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]    
    INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]    
    WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))    
    GROUP BY [t0].[PersonID]    
    ) AS [t3]   

as for the INNER JOINS, the reason you are getting them is because of the relationship between those tables. For instance Person is 1..1 with PersonPhoneNumber (or 1..*). In either case I assume PersonID on PersonPhoneNumber is an FK and a PK value. So in that case the data source has to go out to that external table to see if the value for the PersonPhoneNumber navigation property actually exists. It does this by performing an INNER JOIN on that table.

jwendl