Answering this question got me thinking...

I often use this pattern:

collectionofsomestuff //here it's LinqToEntities
    .Select(something=>new{something.Name,something.SomeGuid})
    .ToArray() //From here on it's LinqToObjects
    .Select(s=>new SelectListItem()
        {
            Text = s.Name, 
            Value = s.SomeGuid.ToString(), 
            Selected = false
        })

Perhaps I'd split it over a couple of lines, but essentially, at the ToArray point, I'm effectively enumerating my query and storing the resulting sequence so that I can further process it with all the goodness of a full CLR to hand.

As I have no interest in any kind of manipulation of the intermediate list, I use ToArray over ToList as there's less overhead.

I do this all the time, but I wonder if there is a better pattern for this kind of problem?

+3  A: 

There is a much better option: AsEnumerable

Usage is similar:

collectionofsomestuff //here it's LinqToEntities
    .Select(something=>new{something.Name,something.SomeGuid})
    .AsEnumerable() //From here on it's LinqToObjects
    .Select(s=>new SelectListItem()
        {
            Text = s.Name, 
            Value = s.SomeGuid.ToString(), 
            Selected = false
        })

This, however, doesn't force a full copy to be made like ToList() or ToArray(), and preserves any deferred execution from your provider.
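To make the difference concrete, here is a minimal sketch that counts how many rows get pulled from the source. It uses an in-memory sequence wrapped with AsQueryable() as a stand-in for a real provider, so it only approximates what LinqToEntities does; Rows, RowsRead and the method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsEnumerableSketch
{
    // Hypothetical counter: how many rows have been pulled from the source so far.
    public static int RowsRead;

    // Hypothetical stand-in for a provider-backed result set.
    static IEnumerable<string> Rows()
    {
        foreach (var name in new[] { " a ", " b ", " c " })
        {
            RowsRead++;
            yield return name;
        }
    }

    // Builds the AsEnumerable() pipeline but never enumerates it.
    public static int RowsReadAfterBuildingQuery()
    {
        RowsRead = 0;
        var query = Rows().AsQueryable()
            .Select(n => n.Trim())             // "provider" side (IQueryable)
            .AsEnumerable()                    // from here on: LINQ to Objects
            .Select(n => n.ToUpperInvariant());
        return RowsRead;                       // nothing has been read yet
    }

    // Pulls a single element through the AsEnumerable() pipeline.
    public static int RowsReadAfterFirst()
    {
        RowsRead = 0;
        var first = Rows().AsQueryable()
            .Select(n => n.Trim())
            .AsEnumerable()
            .Select(n => n.ToUpperInvariant())
            .First();
        return RowsRead;                       // only one row was needed
    }

    // ToArray() enumerates the whole source immediately.
    public static int RowsReadAfterToArray()
    {
        RowsRead = 0;
        var buffered = Rows().AsQueryable()
            .Select(n => n.Trim())
            .ToArray();
        return RowsRead;                       // every row has been read
    }

    public static void Main()
    {
        Console.WriteLine(RowsReadAfterBuildingQuery());
        Console.WriteLine(RowsReadAfterFirst());
        Console.WriteLine(RowsReadAfterToArray());
    }
}
```

With the three-row source above, building the query reads nothing, First() reads one row, and ToArray() reads all three.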

Reed Copsey
Only better if you aren't doing significant work in the remainder of the Linq query. If you are doing significant work, the SqlConnection will remain open for the entirety of the operation. Calling ToArray() or ToList() completes the entire query and closes the SqlConnection at that point.
LorenVS
...I've seen it a million times and it barely registered. Nice one.
spender
So would there be some sort of buffering of the source stream in this case? Or would we literally be pulling results one at a time? (I'm pretty ignorant at this level)
spender
I'm going to write up another answer for this, not enough room here to explain...
LorenVS
+5  A: 

Reed's answer is indeed correct if you are only doing simple assignments in the remainder of the LINQ query. However, if you are doing significant work or computation in the LinqToObjects section of your query, his solution has a drawback once you consider the connection to the underlying data source:

Consider:

collectionofsomestuff //here it's LinqToEntities
    .Select(something=>new{something.Name,something.SomeGuid})
    .AsEnumerable() //From here on it's LinqToObjects
    .Select(s=>new SelectListItem()
        {
            Text = s.Name, 
            Value = s.SomeGuid.ToString(), 
            OtherValue = someCrazyComputationOnS(s)
        })

If you can imagine for a second the code for the LinqToEntities select function (highly simplified, but you should get the picture), it might look something like:

using(SqlConnection con = createConnection())
{
    using(SqlCommand com = con.CreateCommand())
    {
        con.Open();
        com.CommandText = createQuery(expression);

        using(SqlDataReader reader = com.ExecuteReader())
        {
            while(reader.Read())
            {
                yield return createClrObjectFromReader(reader);
            }
        }
    }
}

This method supports the traditional Linq deferred execution patterns. This means that whenever a result is read from the reader, it will be "yielded" back to the caller, and the next value won't be read until the caller requests it.

So, in the above code, the sequence of execution for a result set of 5 records would be:

con.Open();

reader.Read();
createClrObjectFromReader(reader);
// at this point there is a yield back to the caller
someCrazyComputationOnS(s);


reader.Read();
createClrObjectFromReader(reader);
// at this point there is a yield back to the caller
someCrazyComputationOnS(s);


reader.Read();
createClrObjectFromReader(reader);
// at this point there is a yield back to the caller
someCrazyComputationOnS(s);


reader.Read();
createClrObjectFromReader(reader);
// at this point there is a yield back to the caller
someCrazyComputationOnS(s);


reader.Read();
createClrObjectFromReader(reader);
// at this point there is a yield back to the caller
someCrazyComputationOnS(s);

// ONLY here does the connection finally get closed:
con.Close();

Although this does preserve the deferred execution pattern, it is not optimal in this situation. Calling ToList() or ToArray() would cause the entire raw query results to be buffered into an array or list, at which point the SqlConnection could be closed. Only after the SqlConnection had been closed would the calls to someCrazyComputationOnS(s) actually occur.
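If you want to see the two orderings side by side, here is a rough sketch that records events in a log instead of talking to a database. ReadRows is a hypothetical iterator standing in for the provider (its "open" and "close" entries play the role of the SqlConnection), and Compute stands in for someCrazyComputationOnS; none of this is the actual LinqToEntities code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConnectionLifetimeSketch
{
    // Hypothetical stand-in for the provider: logs when the "connection"
    // opens, when each row is read, and when the "connection" closes.
    static IEnumerable<int> ReadRows(List<string> log, int count)
    {
        log.Add("open");
        try
        {
            for (int i = 1; i <= count; i++)
            {
                log.Add("read " + i);
                yield return i;
            }
        }
        finally
        {
            log.Add("close");
        }
    }

    // Hypothetical stand-in for the expensive per-row work.
    static int Compute(List<string> log, int row)
    {
        log.Add("compute " + row);
        return row * row;
    }

    // Runs the pipeline over two rows, either streaming or buffered with ToList().
    public static List<string> Run(bool buffer)
    {
        var log = new List<string>();
        IEnumerable<int> rows = ReadRows(log, 2);
        if (buffer)
        {
            rows = rows.ToList(); // drains the source, so "close" is logged here
        }
        foreach (var result in rows.Select(r => Compute(log, r)))
        {
            // consume the results; the log is what we care about
        }
        return log;
    }

    public static void Main()
    {
        // prints: open, read 1, compute 1, read 2, compute 2, close
        Console.WriteLine(string.Join(", ", Run(buffer: false)));
        // prints: open, read 1, read 2, close, compute 1, compute 2
        Console.WriteLine(string.Join(", ", Run(buffer: true)));
    }
}
```

In the streaming run the "connection" stays open across every Compute call; in the buffered run it is closed before the first one starts.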

In most cases, this isn't a concern and Reed's answer is indeed correct, but in the rare case you are doing large amounts of work on your dataset, you definitely want to buffer the results before proceeding with large LinqToObjects queries.

LorenVS
really good point
andy
Thanks, deferred execution is still not fully understood in some places... I like to make the worst case point about deferred execution, just to make sure people keep it in mind when they are implementing solutions...
LorenVS
The flip side of this argument is that the deferred processing lets you churn through gigantic amounts of data without incurring the cost of loading it all. That can be *very* handy.
spender
+1: The flip side of this, though, is if you do something like Take(10), you're really hurting yourself...
Reed Copsey
Agreed, you have to be aware of which side of the .ToList() or .ToArray() call each operation goes on. Take(10) or similar should go before the ToList() or ToArray() call; putting it on the other side does create serious problems... Just another thing to think about
LorenVS
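For anyone wondering how much the placement of Take(10) matters, here is a small sketch that counts the rows pulled from an in-memory source. Rows and RowsRead are hypothetical names, and the source is plain LINQ to Objects; with a real provider, a Take() placed before the buffering call would normally be translated into the query itself, which this sketch does not model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakePlacementSketch
{
    // Hypothetical counter: how many rows have been pulled from the source.
    public static int RowsRead;

    // Hypothetical stand-in for a large result set.
    static IEnumerable<int> Rows(int total)
    {
        for (int i = 0; i < total; i++)
        {
            RowsRead++;
            yield return i;
        }
    }

    // Take(10) before the buffering call: only the rows that are needed get pulled.
    public static int RowsReadWithTakeBeforeToList()
    {
        RowsRead = 0;
        var firstTen = Rows(1000).Take(10).ToList();
        return RowsRead;
    }

    // Take(10) after the buffering call: the whole source is loaded first.
    public static int RowsReadWithTakeAfterToList()
    {
        RowsRead = 0;
        var firstTen = Rows(1000).ToList().Take(10).ToList();
        return RowsRead;
    }

    public static void Main()
    {
        Console.WriteLine(RowsReadWithTakeBeforeToList()); // 10
        Console.WriteLine(RowsReadWithTakeAfterToList());  // 1000
    }
}
```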