Take the following pseudo C# code:

using System;
using System.Data;
using System.Linq;
using System.Collections.Generic;

public IEnumerable<IDataRecord> GetRecords(string sql)
{
     // DB logic goes here
}

public IEnumerable<IEmployer> Employers()
{
     string sql = "select EmployerID from employer";
     var ids = GetRecords(sql).Select(record => (record["EmployerID"] as int?) ?? 0);
     return ids.Select(employerID => new Employer(employerID) as IEmployer);
}

Would it be faster if the two Select() calls were combined? Is there an extra iteration in the code above? Is the following code faster?

public IEnumerable<IEmployer> Employers()
{
     string sql = "select EmployerID from employer";
     return GetRecords(sql).Select(record => new Employer((record["EmployerID"] as int?) ?? 0) as IEmployer);
}

I think the first example is more readable if there is no difference in performance.

A: 

There is no difference.

LINQ uses delayed (deferred) execution (source). I'll quote the relevant part:

To get around this, all built-in LINQ providers utilize a concept known as delayed execution. Rather than having query operators execute immediately, they all simply return types that implement the IEnumerable(of T) interface. These types then delay execution until the query is actually used in a for each loop.

Basically, until you enumerate the result of Employers(), for example with foreach or .ToList(), it hasn't actually done any work.
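A minimal sketch of that behaviour, assuming an in-memory stand-in for the database. FakeRecords and the Produced counter are hypothetical names used only for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items the source has actually produced so far.
    public static int Produced;

    // Hypothetical stand-in for GetRecords(sql): yields ids lazily.
    public static IEnumerable<int> FakeRecords()
    {
        for (int id = 1; id <= 3; id++)
        {
            Produced++;
            yield return id;
        }
    }

    public static void Main()
    {
        Produced = 0;

        // Building the query runs neither the source nor the selector.
        IEnumerable<int> query = FakeRecords().Select(id => id * 10);
        Console.WriteLine(Produced); // prints 0

        // Enumerating (here via ToList) is what drives the source.
        List<int> results = query.ToList();
        Console.WriteLine(Produced); // prints 3
    }
}
```

The counter stays at zero until ToList() enumerates the query, which is the point the quoted passage makes.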

Alastair Pitts
+2  A: 

There is no significant difference. Both methods return a lazily evaluated sequence that loops through the result from GetRecords.

They are not identical, as the first one has chained Selects, but they still do the same work in the same order. When you loop over the chained Selects, the second Select pulls items one at a time from the first Select as needed; the first Select doesn't have to complete before the second Select can use its results.
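The interleaving can be observed with a small sketch like the one below. It assumes an in-memory array in place of the database, and the log list and Trace method are hypothetical helpers added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainedSelectDemo
{
    // Records the order in which the two selectors run.
    public static List<string> Trace()
    {
        var log = new List<string>();

        var ids = new[] { 1, 2 }
            .Select(n => { log.Add("first " + n); return n; });

        var result = ids
            .Select(n => { log.Add("second " + n); return n; });

        // Enumerate the chained query; each item flows through both
        // selectors before the next item is pulled from the source.
        foreach (var item in result) { }

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Trace()));
        // prints: first 1, second 1, first 2, second 2
    }
}
```

If the first Select had to finish before the second started, the log would show both "first" entries before any "second" entry.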

Guffa
It sounds like with the chained Selects I have two little state machines with one passing off to the other. This is not going to add any significant memory overhead versus the all-in-one?
Justin
@Justin: Yes, they each have their own enumerator (or similar state), so there is a small processing overhead, but the items are still processed one at a time all the way from the GetRecords result, so there is no build-up in memory. The extra overhead is negligible compared to the time it takes to get the data from the database.
Guffa
A: 

The second snippet is faster (in theory), because the first creates and uses an extra iterator object. Iterators add a bit of overhead. However, that overhead is very small and is completely lost when dealing with databases, because database calls are orders of magnitude slower than the extra iterator.
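If you want to measure it yourself, something along these lines would do as a rough sketch. It assumes boxed ints in a list in place of IDataRecord rows, and FakeEmployer, Chained, Combined and SumIds are hypothetical names. Stopwatch timings like this are noisy and the first run includes JIT warm-up, so treat the numbers as indicative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical stand-in for the Employer class from the question.
public sealed class FakeEmployer
{
    public int Id { get; }
    public FakeEmployer(int id) { Id = id; }
}

public static class SelectTiming
{
    // Two chained Select calls, as in the first snippet.
    public static IEnumerable<FakeEmployer> Chained(IEnumerable<object> records) =>
        records.Select(r => (r as int?) ?? 0)
               .Select(id => new FakeEmployer(id));

    // One combined Select call, as in the second snippet.
    public static IEnumerable<FakeEmployer> Combined(IEnumerable<object> records) =>
        records.Select(r => new FakeEmployer((r as int?) ?? 0));

    // Forces full enumeration so the selectors actually run.
    public static long SumIds(IEnumerable<FakeEmployer> employers)
    {
        long sum = 0;
        foreach (var e in employers) sum += e.Id;
        return sum;
    }

    public static void Main()
    {
        // In-memory data only; a real database call would dominate the timing.
        List<object> records = Enumerable.Range(0, 1000000).Cast<object>().ToList();

        var sw = Stopwatch.StartNew();
        long chainedSum = SumIds(Chained(records));
        Console.WriteLine("chained:  " + sw.ElapsedMilliseconds + " ms, sum " + chainedSum);

        sw.Restart();
        long combinedSum = SumIds(Combined(records));
        Console.WriteLine("combined: " + sw.ElapsedMilliseconds + " ms, sum " + combinedSum);
    }
}
```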

Steven