I have a fairly complex LINQ to Entities query that I display on a website. It uses paging, so I never pull down more than 50 records at a time for display.

But I also want to give the user the option to export the full results to Excel or some other file format.

My concern is that there could potentially be a large number of records all being loaded into memory at once to do this.

Is there a way to process a LINQ result set one record at a time, like you can with a DataReader, so that only one record is really kept in memory at a time?

I've seen suggestions that if you enumerate over the LINQ query with a foreach loop, the records will not all be read into memory at once and will not overwhelm the server.

Does anyone have a link to something I could read to verify this?

I'd appreciate any help.

Thanks

+2  A: 

Set MergeOption.NoTracking on the query (since it is a read-only operation). If you are using the same ObjectContext for saving other data, detach each object from the context after you process it.
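For example, a minimal sketch (the context and entity-set names `MyEntities` and `Orders` are hypothetical; substitute your own model):

```csharp
using System;
using System.Data.Objects; // MergeOption lives here in EF 1/4

// Read-only export: turn off change tracking before enumerating,
// so entities are not registered in the context's identity map.
using (var context = new MyEntities())
{
    context.Orders.MergeOption = MergeOption.NoTracking;

    var query = from o in context.Orders
                where o.Total > 100
                select o;

    foreach (var order in query)
    {
        // Write the row to the export file, then let the object
        // go out of scope so it can be garbage-collected.
    }
}
```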

how to detach

foreach (var entity in query)   // query is the IQueryable<T> result
{
    // do something with entity
    objectContext.Detach(entity);
}

Edit: If you are using the NoTracking option, there is no need to detach.

Edit2: I wrote to Matt Warren about this scenario, and I am posting the relevant private correspondence here with his approval:

The results from SQL server may not even be all produced by the server yet. The query has started on the server and the first batch of results are transferred to the client, but no more are produced (or they are cached on the server) until the client requests to continue reading them. This is what is called ‘firehose cursor’ mode, or sometimes referred to as streaming. The server is sending them as fast as it can, and the client is reading them as fast as it can (your code), but there is a data transfer protocol underneath that requires acknowledgement from the client to continue sending more data.

Since IQueryable<T> inherits from IEnumerable<T>, the underlying query sent to the server should be the same either way. However, when we call .ToList(), the data reader used by the underlying connection immediately materializes every object; they are all loaded into the app domain and cannot yet be garbage-collected, so you might run out of memory.

When you are using foreach over the IEnumerable, the data reader reads the SQL result set one row at a time; each object is created, used, and then becomes eligible for collection. The underlying connection might receive data in chunks and might not send an acknowledgement back to SQL Server until all the chunks are read. Hence you will not run into an 'out of memory' exception.

Edit3:

While your query is running, you can actually open SQL Server's "Activity Monitor" and see the query with a Task State of SUSPENDED and a Wait Type of ASYNC_NETWORK_IO, which indicates that the result is sitting in the SQL Server network buffer. You can read more about it here and here

ram
Yes, setting NoTracking should help too. Do you know if there's a way to set it on the whole context? I've just been setting it on the ObjectSet inside the context.
are you looking for `objectContext.MergeOption = MergeOption.NoTracking` ? http://msdn.microsoft.com/en-us/library/bb738896.aspx
ram
I don't think ObjectContext has a MergeOption property. Are you thinking of ObjectQuery instead?
sorry, I am wrong, it's set on the entity set: `objectContext.Table.MergeOption = MergeOption.NoTracking` is the right way to do it
ram
OK. Now, let's say the foreach loop in your example code needs to iterate 1 million times. I just want to check that all 1 million records don't end up in the server's memory, but rather that on each iteration only one "row" is loaded and then discarded at the next iteration, as long as my code doesn't choose to store the data. Can you confirm that this is the case?
A: 

Look at the return type of the LINQ query. It should be IEnumerable<T>, which loads only one object at a time. If you then use something like .ToList(), they will all be loaded into memory. Just make sure your code doesn't keep a list or use more than one instance at a time, and you will be fine.

Edit: To add on to what people have said about foreach... If you do something like:

var query = from o in Objects
            where o.Name == "abc"
            select o;

foreach (Object o in query)
{
   // Do something with o
}

The query portion uses deferred execution (see examples), so the objects are not in memory yet. The foreach iterates through the results, fetching only one object at a time. query exposes an IEnumerator, which has Reset() and MoveNext(). The foreach calls MoveNext() each round until there are no more results.
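The foreach above is roughly equivalent to driving the enumerator by hand. A sketch using LINQ to Objects, just to illustrate the mechanics (the same pull-one-at-a-time pattern applies to a LINQ to Entities query):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4, 5 };
        var query = numbers.Where(n => n > 2); // deferred: nothing has executed yet

        // foreach (var n in query) { ... } expands to roughly this:
        using (var e = query.GetEnumerator())
        {
            while (e.MoveNext())               // pulls one result at a time
            {
                Console.WriteLine(e.Current);  // prints 3, then 4, then 5
            }
        }
    }
}
```

Each MoveNext() produces exactly one element; nothing is buffered unless your own code stores the results.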

Nelson