views:

110

answers:

3

What's the best approach to call Dispose() on the elements of a sequence?

Suppose there's something like:

IEnumerable<string> locations = ...
var streams = locations.Select ( a => new FileStream ( a , FileMode.Open ) );
var notEmptyStreams = streams.Where ( a => a.Length > 0 );
//from this point on only `notEmptyStreams` will be used/visible
var firstBytes = notEmptyStreams.Select ( a => a.ReadByte () );
var average = firstBytes.Average ();

How do you dispose FileStream instances (as soon as they're no longer needed) while maintaining concise code?


To clarify: this is not an actual piece of code, those lines are methods across a set of classes, and FileStream type is also just an example.


Is doing something along the lines of:

public static IEnumerable<TSource> Where<TSource> (
            this IEnumerable<TSource> source ,
            Func<TSource , bool> predicate
        )
        where TSource : IDisposable {
    foreach ( var item in source ) {
        if ( predicate ( item ) ) {
            yield return item;
        }
        else {
            item.Dispose ();
        }
    }
}

might be a good idea?


Alternatively: do you always solve a very specific scenario with regards to IEnumerable<IDisposable> without trying to generalize? Is it so because having it is an untypical situation? Do you design around having it in the first place? If so, how?

+2  A: 

A simple solution is the following:

List<Stream> streams = locations
    .Select(a => new FileStream(a, FileMode.Open))
    .ToList();

try
{
    // Use the streams.
}
finally
{
    foreach (IDisposable stream in streams)
        stream.Dispose();
}

Note that even with this you could in theory still fail to close a stream if one of the FileStream constructors fails after others have already been constructed. To fix that you need to be more careful constructing the inital list:

List<Stream> streams = new List();
try
{
    foreach (string location in locations)
    {
        streams.Add(new FileStream(location, FileMode.Open));
    }

    // Use the streams.
}
finally { /* same as before */ }

It's a lot of code and it's not concise like you wanted but if you want to be sure that all your streams are being closed, even when there are exceptions, then you should do this.

If you want something more LINQ-like you might want to read this article by Marc Gravell:

Mark Byers
This assumes there's a specific location in code where you can put that. Also it doesn't dispose empty streams as soon as it's determined they're empty and no longer needed.
chase
+3  A: 

I suggest you turn the streams variable into an Array or a List, because enumerating through it a second time will (if I'm not mistaken) create new copies of the streams.

var streams = locations.Select(a => new FileStream(a, FileMode.Open)).ToList();
// dispose right away of those you won't need
foreach (FileStream stream in streams.Where(a => a.Length == 0))
    stream.Dispose();

var notEmptyStreams = streams.Where(a => a.Length > 0);
// the rest of your code here

foreach (FileStream stream in notEmptyStreams)
    stream.Dispose();

EDIT For these constraints, maybe LINQ isn't the best tool around. Maybe you could get away with a simple foreach loop?

var streams = locations.Select(a => new FileStream(a, FileMode.Open));
int count = 0;
int sum = 0;
foreach (FileStream stream in streams) using (stream)
{
    if (stream.Length == 0) continue;
    count++;
    sum += stream.ReadByte();
}
int average = sum / count;
zneak
+1: I think you are right about the temporary copy without the ToArray: I hadn't noticed that.
Mark Byers
Good point about iterating a second time. That's exactly why I'm looking for a pattern where instances would get disposed once they've been used, thus at the time of the second iteration the “old” instances would no longer exist (or at least not hold onto resources).
chase
@chase If this is your concern, as I said, turning the sequence into an array will prevent further evaluations of the selector, so you won't get more than one copy of each stream.
zneak
@zneak: true, but the sequence has a lot of elements, each holding onto several handles, and large memory blocks. Generators are the only option, arrays or lists won't do. And which is why I'd rather dispose them sooner than later.
chase
@chase Then maybe LINQ isn't the tool you're looking for. There are sacrifices to do if you want to achieve this kind of performance. What about running a simple foreach?
zneak
+1  A: 

I would write a method, say, AsDisposableCollection that returns a wrapped IEnumerable which also implements IDisposable, so that you can use the usual using pattern. This is a bit more work (implementation of the method), but you need that only once and then you can use the method nicely (as often as you need):

using(var streams = locations.Select(a => new FileStream(a, FileMode.Open))
                             .AsDisposableCollection()) {
  // ...
} 

The implementation would look roughly like this (it is not complete - just to show the idea):

class DisposableCollection<T> : IDisposable, IEnumerable<T> 
                                where T : IDisposable {
  IEnumerable<T> en; // Wrapped enumerable
  List<T> garbage;   // To keep generated objects

  public DisposableCollection(IEnumerable<T> en) {
    this.en = en;
    this.garbage = new List<T>();
  }
  // Enumerates over all the elements and stores generated
  // elements in a list of garbage (to be disposed)
  public IEnumerator<T> GetEnumerator() { 
    foreach(var o in en) { 
      garbage.Add(o);
      yield return o;
    }
  }
  // Dispose all elements that were generated so far...
  public Dispose() {
    foreach(var o in garbage) o.Dispose();
  }
}
Tomas Petricek
I've been thinking about that. But suppose a method returns DisposableCollection<>. This means you either forgo the ability to filter the result (or you lose the “disposability”), or lose the fluency of LINQ, or have to implement your own .Where(), etc. methods, making them return DisposableCollection<> as well.
chase