My question is concerning SQL connection status, load, etc. based on the following code:

public IEnumerable<MyType> GetMyTypeObjects()
{
  string cmdTxt = "select * from MyObjectTable";

  using(SqlConnection conn = new SqlConnection(connString))
  {
    using(SqlCommand cmd = new SqlCommand(cmdTxt, conn))
    {
      conn.Open();
      using(SqlDataReader reader = cmd.ExecuteReader())
      {
         while(reader.Read())
         {
            yield return Mapper.MapTo<MyType>(reader);
         }
       }
    }
  }
  yield break;
}

I can see this possibly being a problem if many processes run similar code with long execution times between iterations of the IEnumerable, because the connections will stay open longer, etc. However, it also seems plausible that this will reduce CPU usage on the SQL server, because it only returns data when the IEnumerable is actually consumed. It also lowers memory usage on the client, because the client only has to hold one instance of MyType at a time rather than loading all occurrences of MyType (by iterating through the entire DataReader and returning a List or something).

  • Are there any instances you can think of where you would not want to use IEnumerable in this manner, or any instances you think it fits perfectly?

  • What kind of load does this put on the SQL server?

  • Is this something you would use in your own code (barring any mention of NHibernate, Subsonic, etc)?

+2  A: 

I recommend against pre-optimizing. In many situations, the connections will be pooled.

I also don't expect any difference in load on SQL Server - the query will already have been compiled, and will be running.

John Saunders
I'm not sure what you mean by pre-optimizing (I'm not trying to). I was thinking load would be affected because the server would have to maintain cursor position on more connections at once.
scottm
By pre-optimizing, I mean that you're worrying about performance issues before your code works. You're thinking about problems that may not exist.
John Saunders
I understand. I also don't want to code myself into a corner, having to redesign my data access methods and the methods that use them in the future.
scottm
+6  A: 

This is not a pattern I would follow. I would not worry as much about load on the server as I would about locks. Following this pattern integrates the data retrieval process into your business logic flow, and that seems like an all-around recipe for trouble; you have no idea what happens on the iteration side, and you're inserting yourself into it. Retrieve your data in one shot, then allow the client code to enumerate over it once you've closed the reader.
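A minimal sketch of the "retrieve in one shot" approach, assuming the same `connString` and `Mapper` from the question:

```csharp
// Read everything while the connection is open, then hand back a
// fully-materialized list. Mapper.MapTo<T> is assumed from the question.
public IList<MyType> GetMyTypeObjects()
{
    string cmdTxt = "select * from MyObjectTable";
    var results = new List<MyType>();

    using (SqlConnection conn = new SqlConnection(connString))
    using (SqlCommand cmd = new SqlCommand(cmdTxt, conn))
    {
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                results.Add(Mapper.MapTo<MyType>(reader));
            }
        }
    }

    // The reader and connection are closed before any caller starts
    // enumerating, so client code can't keep locks or connections open.
    return results;
}
```

The trade-off is the one scottm raised: the whole result set is held in client memory at once.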

Adam Robinson
Locks on the SQL objects? I didn't consider that. What do you think about giving a nolock hint (which would only be viable under certain circumstances)?
scottm
That would (obviously) eliminate the lock issue, but you're still keeping the reader open for an undefined period of time (possibly forever if the consuming code forgets to dispose of the enumerator). It just seems like a relatively high risk for low gain. If there's a particular need for this practice--memory consumption I suppose could qualify--then it's not outright evil, but I definitely wouldn't make it a habit and I'd look for alternatives.
Adam Robinson
And as an aside, you are by no means guaranteed "one instance of MyType". Indeed, even if the consuming code only deals with one instance at a time, the GC isn't going to clean them up immediately. Yes, you will have them fall out of scope and be eligible for collection sooner, but it isn't going to be a constant memory footprint.
Adam Robinson
@scottm - the nolock hint is mis-used far too often. If you intend to use it - you should make sure that you fully understand all of the implications of it to your app.
Scott Ivey
@Scott Ivey, I understand the implications. That's why I said it would only be viable in certain situations.
scottm
Agreed with Adam (as supplemented by Guffa) - in cases where there's a proven need to reduce memory use, it would be better for a client to obtain an `IDataReader`, utilize the `Mapper`, and manage the resource disposal itself.
Jeff Sternal
I can't explain how much I agree with this answer. +OVER9000
DrJokepu
This method style is actually used in a data factory class that takes a string and returns an IEnumerable to repositories that are exposed client-side. These methods really aren't directly accessible to client code.
scottm
Been here, done that, still kicking myself... A big problem we have seen is people reading only the first record and breaking out of the loop. This leaves the connection open until the GC catches up, which more often than not exhausts the connection pool. My STRONG recommendation: never return control to external code while the connection is open. -- Just my 2 cents
csharptest.net
+2  A: 

I wouldn't use it, because it hides what's happening, and it could possibly leave database connections without proper disposal.

The connection object is closed when you have read past the last record; if you stop reading before that, the connection object won't be disposed. If, for example, you know that a result always has ten records and you loop over exactly those ten without making the eleventh Read call that moves past the last item, the connection is not closed properly. Also, if you wanted to use only part of the result, you have no way of closing the connection without reading the rest of the records.

Even the built-in extension methods for enumerables may cause this, even if you seemingly use them correctly:

foreach (MyType item in GetMyTypeObjects().Take(10)) {
   ...
}
Guffa
That's a good point, hadn't considered it either. I'm so blinded by the way I'd use it!
scottm
@Guffa: Using extension methods such as `Take` wouldn't be a problem: The `GetMyTypeObjects` method is just syntactic sugar which creates an `IDisposable` iterator object. `Take` will call the iterator's `Dispose` method when it's done, and that will then dispose of the connection, command, reader etc.
LukeH
@Guffa: And the same applies to any other partial reads of the resultset: So long as you call `Dispose` when you're done (or preferably just wrap everything in a `using` block) then the connection, command, reader etc will also be disposed.
LukeH
@Luke: Yes, you are right. I tested this, and somehow it manages to create a Dispose method that seems to do the right thing, but I'm not sure how it knows which objects to dispose...
Guffa
@Guffa: When the compiler rewrites the iterator method as a class, all local variables in the method become fields of the class. I'm guessing that the compiler identifies any locals that would've been disposed in the method and disposes of the equivalent fields in the class's `Dispose` method.
LukeH
Eric Lippert has some great posts about the inner workings of iterators: http://blogs.msdn.com/ericlippert/archive/tags/Iterators/default.aspx
LukeH
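LukeH's point can be demonstrated without a database. In a self-contained sketch, a `try/finally` inside the iterator stands in for the question's `using` blocks (which compile down to `try/finally`), and a partial read via `foreach`/`break` still triggers the cleanup:

```csharp
using System;
using System.Collections.Generic;

class IteratorDisposalDemo
{
    static bool cleaned;

    // The finally block stands in for disposing the reader,
    // command and connection in the question's method.
    static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            cleaned = true;   // runs when the enumerator is disposed
        }
    }

    static void Main()
    {
        foreach (int n in Numbers())
        {
            break;            // partial read: only the first item
        }
        // foreach disposed the enumerator, so the finally block ran
        Console.WriteLine(cleaned);   // True
    }
}
```

This is why `Take(10)` is safe: LINQ operators dispose the inner enumerator when they finish, which runs the suspended `finally` blocks.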
+1  A: 

My concern would be that you're putting yourself completely at the mercy of the client code.

You've already mentioned that the calling code might hold the connection open longer than is strictly necessary, but there's also the possibility that it would never allow the objects to be closed/disposed.

So long as the client code uses `foreach`, `using`, etc., or explicitly calls the enumerator's `Dispose` method, then you're fine, but there's nothing to stop it from doing something like this:

var e = GetMyTypeObjects().GetEnumerator();
e.MoveNext();    // open the connection etc

// forget about the enumerator and go away and do something else
// now the reader, command and connection won't be closed/disposed
// until the GC kicks in and calls their finalisers
LukeH
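For contrast, a sketch of the safe version of the same manual enumeration (`Process` is a hypothetical consumer, not from the original code):

```csharp
// Wrapping the enumerator in a using block guarantees the
// compiler-generated Dispose runs, which in turn disposes the
// reader, command and connection from the iterator's using blocks.
using (IEnumerator<MyType> e = GetMyTypeObjects().GetEnumerator())
{
    while (e.MoveNext())
    {
        Process(e.Current);   // hypothetical per-item work
    }
}   // Dispose runs here, even if Process throws or we exit early
```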