Your description of (1) is correct, but it is an example of Eager Loading rather than Lazy Loading.
Your description of (2) is incorrect. (2) is technically using no loading at all, but will use Lazy Loading if you try to access any non-scalar values on your FlaggedDates.
In either case, you are correct that no data will be loaded from your data store until you attempt to "do something" with the _flaggedDates. However, what happens is different in each case.
(1): Eager loading: as soon as you begin your for
loop, every one of the objects that you have specified will get pulled from the database and built into a gigantic in-memory data structure. This will be a very expensive operation, pulling an enormous amount of data from your database. However, it will all happen in one database round trip, with a single SQL query getting executed.
(2): Lazy loading: When your for
loop begins, it will only load the FlaggedDates objects. However, if you access related objects inside your for
loop, it will not have those objects loaded into memory yet. The first attempt to retrieve the scheduledSchools for a given FlaggedDate will result in either a new database roundtrip to retrieve the schools, or an Exception being thrown because your context has already been disposed. Since you'd be accessing the scheduledSchools collection inside a for
loop, you would have a new database round trip for every FlaggedDate that you initially loaded at the beginning of the for
loop.
Reponse to Comments
Disabling Lazy Loading is not the same as Enabling Eager Loading. In this example:
context.ContextOptions.LazyLoadingEnabled = false;
var schools = context.FlaggedDates.First().scheduledSchools;
The schools
variable will contain an empty EntityCollection, because I didn't Include
them in the original query (FlaggedDates.First()), and I disabled lazy loading so that they couldn't be loaded after the initial query had been executed.
You are correct that the where d.dateID == 2
would mean that only the objects related to that specific FlaggedDate object would be pulled in. However, depending on how many objects are related to that FlaggedDate, you could still end up with a lot of data going over that wire. This is due to the way the EntityFramework builds out its SQL query. SQL Query results are always in a tabular format, meaning you must have the same number of columns for every row. For every scheduledSchool object, there needs to be at least one row in the result set, and since every row has to contain at least some value for every column, you end up with every scalar value on your FlaggedDate object being repeated. So if you have 10 scheduledSchools and 10 interviews associated with your FlaggedDate, you'll end up with 20 rows that each contain every scalar value on FlaggedDate. Half of the rows will have null values for all the ScheduledSchool columns, and the other half will have null values for all of the Interviews columns.
Where this gets really bad, though, is if you go "deep" in the data you're including. For example, if each ScheduledSchool had a students
property, which you included as well, then suddenly you would have a row for each Student in each ScheduledSchool, and on each of those rows, every scalar value for the Student's ScheduledSchool would be included (even though only the first row's values end up getting used), along with every scalar value on the original FlaggedDate object. It can add up quickly.
It's difficult to explain in writing, but if you look at the actual data coming back from a query with multiple Include
s, you will see that there is a lot of duplicate data. You can use LinqPad to see the SQL Queries generated by your EF code.