I've been using Linq-to-SQL for quite a while and it works great. However, lately I've been experimenting with using it to pull really large amounts of data and am running into some issues. (Of course, I understand that L2S may not be the right tool for this particular kind of processing, but that's why I'm experimenting: to find its limits.)
Here's a code sample:

    var buf = new StringBuilder();
    var dc = new DataContext(AppSettings.ConnectionString);
    var records = from a in dc.GetTable<MyReallyBigTable>()
                  where a.State == "OH"
                  select a;
    var i = 0;
    foreach (var record in records) {
        buf.AppendLine(record.ID.ToString());
        i += 1;
        if (i > 3) {
            break; // Takes forever...
        }
    }
Once I start iterating over the data, the query executes as expected. When stepping through the code, I enter the loop right away, which is exactly what I hoped for: that means L2S appears to be using a `DataReader` behind the scenes instead of pulling all the data first. However, once I get to the `break`, the query continues to run and pulls all the rest of the records. Here are my questions for the SO community:
1.) Is there a way to stop Linq-to-SQL from finishing execution of a really big query partway through, the way you can with a `DataReader`?
2.) If you execute a large Linq-to-SQL query, is there a way to prevent the `DataContext` from filling up with change-tracking information for every object returned? Basically, instead of filling up memory, can I do a large query with short object lifecycles, the way you can with `DataReader` techniques?
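For reference, this is roughly the `DataReader` early exit I have in mind in question 1. It's only a minimal sketch: `EarlyExitSketch` and `ReadFirstIds` are made-up names, the SQL text just mirrors the table and columns from the sample above, and it assumes the classic `System.Data.SqlClient` provider. As far as I understand, calling `SqlCommand.Cancel()` before the reader is closed is what keeps the close from draining the remaining rows.

```csharp
using System.Data.SqlClient;
using System.Text;

static class EarlyExitSketch
{
    // Hypothetical helper: reads at most maxRows IDs, then abandons the query.
    public static string ReadFirstIds(string connectionString, int maxRows)
    {
        var buf = new StringBuilder();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "SELECT ID FROM MyReallyBigTable WHERE State = @state", conn))
        {
            cmd.Parameters.AddWithValue("@state", "OH");
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                var i = 0;
                while (reader.Read())
                {
                    buf.AppendLine(reader.GetValue(0).ToString());
                    i += 1;
                    if (i >= maxRows)
                    {
                        // Ask the server to stop sending rows before the
                        // reader is disposed, so the close has nothing
                        // left to drain.
                        cmd.Cancel();
                        break;
                    }
                }
            }
        }
        return buf.ToString();
    }
}
```

Each row here lives only for one loop iteration, which is the short object lifecycle I'm after in question 2.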
I'm okay if this isn't functionality built into the `DataContext` itself and requires extending it with some customization. I'm just looking to leverage the simplicity and power of Linq for large queries in nightly processing tasks instead of relying on T-SQL for everything.