This question is related to a previous question of mine.

This is my current code:

 IEnumerable<Shape> Get()
 {
     while (/* get implementation */)
         yield return new Shape(/* ... */);
 }

 void Insert()
 {
     var actual = Get();
     using (var db = new DataClassesDataContext())
     {
         db.Shapes.InsertAllOnSubmit(actual);
         db.SubmitChanges();
     }
 }

I'm running out of memory because the IEnumerable is too big. How do I prevent this?

+3  A: 

One option is to break it up into multiple batches. Create a temporary buffer of Shape objects, fill it from the enumerator until it is full or the enumerator runs out, then do an InsertBatchOnSubmit.

Erich Mirabal
How would I get all elements in groups of 5?
Jader Dias
I understood that InsertBatchOnSubmit would be an InsertAllOnSubmit with fewer elements.
Jader Dias
Earwicker's link has an excellent example. I am not sure that will help you, though, since you are doing deferred execution. You might need a List<Shape> and batchSize = 5 declared outside of a loop. Add items from your enumerator, insert once the count reaches batchSize, and then clear out the previous batch. Is that what you were asking?
Erich Mirabal
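
The buffer-and-flush loop described in the comment above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: BufferedInsert, InsertInBatches and the flush callback are hypothetical names, and flush stands in for calling db.Shapes.InsertAllOnSubmit(batch) followed by db.SubmitChanges().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferedInsert
{
    // Pulls items from 'source' one at a time and hands them to 'flush'
    // in groups of at most 'batchSize'. Returns the number of batches flushed.
    // 'flush' is a hypothetical stand-in for InsertAllOnSubmit + SubmitChanges.
    public static int InsertInBatches<T>(
        IEnumerable<T> source, int batchSize, Action<IReadOnlyList<T>> flush)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var buffer = new List<T>(batchSize);
        int batches = 0;

        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == batchSize)
            {
                flush(buffer);   // persist the full batch
                buffer.Clear();  // drop references so the items can be collected
                batches++;
            }
        }

        // Persist whatever is left over when the source runs out.
        if (buffer.Count > 0)
        {
            flush(buffer);
            batches++;
        }
        return batches;
    }
}
```

Reusing one buffer with Clear() is only safe here on the assumption that flush has finished with the batch by the time it returns; only one batch of items is held by the buffer at any moment.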
+3  A: 

Try using InsertOnSubmit rather than InsertAllOnSubmit. And then commit at appropriate intervals, like Erich said.

Or, if you want to do it in batches of, e.g., 5, try Handcraftsman's or dtb's solutions for getting an IEnumerable of IEnumerables. E.g., with dtb's Chunk:

   var actual = Get();
   using (var db = new DataClassesDataContext())
   {
       foreach (var batch in actual.Chunk(5))
       {
           db.Shapes.InsertAllOnSubmit(batch);
           db.SubmitChanges();
       }
   }
Matthew Flaschen
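
dtb's Chunk is a custom extension method from the linked answer. On .NET 6 and later, System.Linq includes a built-in Enumerable.Chunk with a similar shape, which yields arrays of up to the requested size. A minimal sketch, assuming .NET 6 or later and using plain integers in place of Shape (ChunkDemo and BatchSizes are hypothetical names used only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Splits 'source' with the built-in Enumerable.Chunk (.NET 6+)
    // and records how many items each batch contained.
    public static List<int> BatchSizes(IEnumerable<int> source, int size)
    {
        var sizes = new List<int>();
        foreach (int[] batch in source.Chunk(size))
        {
            // In the thread's scenario, this is where the batch would be
            // passed to InsertAllOnSubmit and followed by SubmitChanges.
            sizes.Add(batch.Length);
        }
        return sizes;
    }
}
```

The last batch is simply shorter when the item count is not a multiple of the batch size.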
+1  A: 

For a neat way to get batches of items from an IEnumerable, see this:

http://stackoverflow.com/questions/1008785/c-cleanest-way-to-divide-a-string-array-into-n-instances-n-items-long/1008974#1008974

Update: No good, that works on arrays. If I have some time later and no one else has provided something, I'll write it up...

Daniel Earwicker
Nice idea, man!
Jader Dias
Would that work in his case? He doesn't know the size since he only has an IEnumerable.
Erich Mirabal
See the 'Update' part of my answer! I'm cooking dinner right now...
Daniel Earwicker
Eric Lippert pointed Erich at the solution (http://stackoverflow.com/questions/1008785#answer-1008855). dtb gave a function that takes an IEnumerable<T> and returns an IEnumerable<IEnumerable<T>>. Each of the inner IEnumerable<T>s has up to (e.g.) 5 elements.
Matthew Flaschen
+1  A: 

Use the following extension method to break the input into appropriately sized subsets:

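using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExtensions
{
    // Lazily yields consecutive lists of at most 'max' items from 'source'.
    public static IEnumerable<List<T>> InSetsOf<T>(this IEnumerable<T> source, int max)
    {
        List<T> toReturn = new List<T>();
        foreach (var item in source)
        {
            toReturn.Add(item);
            if (toReturn.Count == max)
            {
                yield return toReturn;
                // Start a fresh list so the one already returned is not modified.
                toReturn = new List<T>();
            }
        }
        // Return the final, partially filled set, if any.
        if (toReturn.Any())
        {
            yield return toReturn;
        }
    }
}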

Then persist the subsets:

void Insert()
{
    var actual = Get();
    using (var db = new DataClassesDataContext())
    {
        foreach (var set in actual.InSetsOf(5))
        {
            db.Shapes.InsertAllOnSubmit(set);
            db.SubmitChanges();
        }
    }
}

You might also find this MSDN article on InsertOnSubmit() vs InsertAllOnSubmit() to be useful.

Handcraftsman
Use toReturn.Clear() rather than toReturn = new List<T>() to avoid overhead. This is similar to http://stackoverflow.com/questions/1008785/c-cleanest-way-to-divide-a-string-array-into-n-instances-n-items-long/1008974#answer-1008855 but a little more explicit.
Matthew Flaschen
Clearing the list instead of creating a new one has the side effect of unexpectedly changing the result that was previously returned, a problem if it had not yet been consumed by the caller. For example: Enumerable.Range(1, 100).InSetsOf(5).InSetsOf(5).ToList().ForEach(x => Console.WriteLine(x.First().First() + "-" + x.Last().Last())); gets 1-25 26-50 51-75 76-100 as coded, but 21-25 46-50 71-75 96-100 if the list is only cleared. Also, since GroupBy is not used it can lazily return results instead of consuming the entire input first.
Handcraftsman
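
The aliasing problem described in the comment above can be reproduced with a smaller example. This is an illustrative sketch, not code from the thread: BatchDemo, InSetsOfFresh and InSetsOfReused are hypothetical names. The first method allocates a new list per batch, as the answer does; the second reuses one list via Clear().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchDemo
{
    // Allocates a new list for every batch, so returned batches stay intact.
    public static IEnumerable<List<T>> InSetsOfFresh<T>(IEnumerable<T> source, int max)
    {
        var current = new List<T>();
        foreach (var item in source)
        {
            current.Add(item);
            if (current.Count == max)
            {
                yield return current;
                current = new List<T>();
            }
        }
        if (current.Count > 0)
            yield return current;
    }

    // Reuses a single list: every yielded batch is the same object,
    // and Clear() empties a batch the caller may still be holding.
    public static IEnumerable<List<T>> InSetsOfReused<T>(IEnumerable<T> source, int max)
    {
        var buffer = new List<T>();
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == max)
            {
                yield return buffer;
                buffer.Clear();
            }
        }
        if (buffer.Count > 0)
            yield return buffer;
    }
}
```

Calling ToList() on InSetsOfFresh(Enumerable.Range(1, 7), 3) gives three distinct lists holding 1-3, 4-6 and 7. The same call on InSetsOfReused gives three references to one list, which by then contains only 7. The reused variant is only safe when each batch is fully consumed before the next one is requested.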