views:

101

answers:

2

Looking at the profiler I see a few differences. The second query which uses the include will in fact return data from related to the secondary table CountryCodes. This part makes sense to me. I don't however understand why this query has two joins. First it does a regular inner join between CountryCodes ands CountyCodeTypes (on the foreign key) which I would think be sufficient to return everything that the include requires. However it then does another outer join. Why?

var query = from codes in base.context.CountryCodes
            join codeTypes in base.context.CountryCodeTypes
            on codes.CountryCodeTypeId equals codeTypes.CountryCodeTypeId
            where codeTypes.CountryCodeTypeName == countryCodeType
            select codes;

var query = from codes in base.context.CountryCodes.Include("CountryCodeType")
            where codes.CountryCodeType.CountryCodeTypeName == countryCodeType
            select codes;

resulting sql:

 FROM   [dbo].[CountryCode] AS [Extent1]
 INNER JOIN [dbo].[CountryCodeType] AS [Extent2] ON [Extent1].[CountryCodeTypeId] = [Extent2].[CountryCodeTypeId]
 LEFT OUTER JOIN [dbo].[CountryCodeType] AS [Extent3] ON [Extent1].[CountryCodeTypeId] = [Extent3].[CountryCodeTypeId]
 WHERE [Extent2].[CountryCodeTypeName] = @p__linq__0

Also, is it fair to say that I should use the .Include only when I actually need data in the foreign key table to be populated in my result, otherwise use the join? In other words I shouldn't use the .Include as the means to the join because the navigational properties know how to join entities for me based on the keys.

+1  A: 

The left outer join happens as a result of the codes.CountryCodeType.CountryCodeTypeName == countryCodeType, whereas the inner join happens to allow it to include fields from the CountryCodeType table in the final results.

If you don't need the data from the foreign key table in your result, you don't need to use Include or Join. If you didn't use "Include", it would only use the left outer join, and not the inner join.

I'm guessing the framework simply isn't smart enough to realize that it's already done a join on that table, and can reuse the information there. Hopefully SQL Server is smart enough to pick up on that and use an execution plan that avoids duplicating that effort.

StriplingWarrior
Wouldn't I need the join in order to do my where condition though?
e36M3
Nope. The framework figures out that you're referencing a related table and creates the necessary join in SQL, just based on the property access.
StriplingWarrior
I just realized I had the inner join and outer join backwards. Kirk's answer is more correct.
StriplingWarrior
+2  A: 

This is just the nature of Entity Framework generated SQL.

The INNER JOIN exists because of your where statement.

where codes.CountryCodeType.CountryCodeTypeName == countryCodeType

The only way EF can resolve this is to perform an INNER JOIN, as you correctly point out. You are also correct in noting that the INNER JOIN does in fact return all the data required to satisfy the Include().

However the OUTER JOIN is still executed, simply because EF sees an Include() and parses that as requiring the join. Consider the case where you don't have a where clause - you'd need an OUTER JOIN then, right? Well EF isn't smart enough to determine that the OUTER JOIN isn't required in this case; it sees an Include() and then generates the relevant OUTER JOIN to ensure that the data requirements will be satisfied. In other words, it isn't considering the rest of your query to determine whether the join is required - it just does it regardless.

Regarding the Include() operator, you would only use it when you want to retrieve those related objects back to your application. It's not required for this query. The simplest query in this case would be

var query = from codes in base.context.CountryCodes
            where codes.CountryCodeType.CountryCodeTypeName == countryCodeType
            select codes;
Kirk Broadhurst
Thank you Kirk. So is it fair to say then that the query you suggested without the join is otherwise equivalent to the one I wrote with the JOIN. However your query just utilizes the navigational property inferred from the data model? I also understand what you mean regarding the `Include()` translating into an automatic OUTER JOIN. However one thing comes to mind. In any situation where there is in fact a foreign key constraint between the two tables, wouldn't an INNER JOIN be sufficient?
e36M3
1/ You can perform joins using the navigational properties described in the data model - I believe this is a central feature of EF. 2/ If you weren't doing a `where`, there'd be no INNER JOIN. And if you want to retrieve the related objects using `Include`, if EF used an INNER JOIN you'd only retrieve those objects that had related objects. OUTER JOIN ensures that you retrieve both those that have related objects, and those that don't (where the OUTER JOIN is empty).
Kirk Broadhurst