For programmers who do not come from a functional programming background, are there any mistakes to avoid?

+6  A: 

The biggest mistake people tend to make is to misunderstand the laziness and evaluation rules for a LINQ query:

Queries are lazy: they are not executed until you iterate over them:

// This does nothing! No query executed!
var matches = results.Where(i => i.Foo == 42);

// Iterating them will actually do the query.
foreach (var match in matches) { ... }

Also, results are not cached. They are computed each time you iterate over them:

var matches = results.Where(i => i.ExpensiveOperation() == true);

// This will perform ExpensiveOperation on each element.
foreach (var match in matches) { ... }

// This will perform ExpensiveOperation on each element again!
foreach (var match in matches) { ... }

Bottom line: know when your queries get executed.
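If you need to iterate the results more than once, one common way to avoid the recomputation is to materialize the query with ToList() (or ToArray()). The following is a minimal sketch, assuming LINQ to Objects; the Item class and its call counter are hypothetical and exist only to make the number of evaluations visible:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type that counts how often the expensive call runs.
class Item
{
    public static int Calls;
    public int Foo;
    public bool ExpensiveOperation() { Calls++; return Foo % 2 == 0; }
}

static class MaterializeDemo
{
    // Iterates the deferred query twice; the predicate runs on every pass.
    public static int CountCallsDeferred(List<Item> results)
    {
        Item.Calls = 0;
        var matches = results.Where(i => i.ExpensiveOperation());
        foreach (var match in matches) { }
        foreach (var match in matches) { }
        return Item.Calls;
    }

    // ToList() runs the query once and stores the results in a list,
    // so later iterations walk the list instead of re-running the predicate.
    public static int CountCallsMaterialized(List<Item> results)
    {
        Item.Calls = 0;
        var matches = results.Where(i => i.ExpensiveOperation()).ToList();
        foreach (var match in matches) { }
        foreach (var match in matches) { }
        return Item.Calls;
    }

    static void Main()
    {
        var results = Enumerable.Range(1, 3).Select(n => new Item { Foo = n }).ToList();
        Console.WriteLine(CountCallsDeferred(results));     // 6
        Console.WriteLine(CountCallsMaterialized(results)); // 3
    }
}
```

The trade-off is that the materialized list is a snapshot: it will not reflect later changes to the source.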

Judah Himango
I think there's an entity-level cache (correct me if I'm wrong): if you request an entity twice (with the same ID), the cached entity is returned the second time. For a Where query, I'm not sure the cache is used, because it has to check the DB anyway since the query itself is not cached.
Guillaume86
Entity Framework may have its own cache, but Entity Framework is just one LINQ provider. LINQ is just language-integrated-query, it does not care **what** you're querying. (It could just be an array, for example.) Individual query providers may have their own gotchas to be aware of, on top of anything LINQ has.
Judah Himango
+2  A: 

IMO when you face LINQ, you must know these topics (they're big sources of errors):

Deferred Execution (on SO)

Closure (on SO - 1)

Closure (on SO - 2)

Closure (Eric Lippert's Blog)

digEmAll
+2  A: 

For programmers who do not come from a functional programming background, are there any mistakes to avoid?

Good question. As Judah points out, the biggest one is that a query expression constructs a query; it does not execute the query that it constructs.

An immediate consequence of this fact is that executing the same query twice can return different results.

An immediate consequence of that fact is that executing a query the second time does not re-use the results of the previous execution, because the new results might be different.
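To make those two consequences concrete, here is a small sketch using LINQ to Objects. The names (RerunDemo, CountTwice) are illustrative only; the point is that the query object is built once but run twice, against a source that changes in between:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RerunDemo
{
    // Builds one query, then executes it before and after the source changes.
    public static (int first, int second) CountTwice()
    {
        var numbers = new List<int> { 1, 2, 3 };
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0); // nothing runs yet

        int first = evens.Count();   // executes the query: matches 2
        numbers.Add(4);              // the source changes between executions
        int second = evens.Count();  // executes it again: matches 2 and 4
        return (first, second);
    }

    static void Main()
    {
        var (first, second) = CountTwice();
        Console.WriteLine($"{first} then {second}"); // 1 then 2
    }
}
```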

Another important fact is queries are best at asking questions, not changing state. Try to avoid any query that directly or indirectly causes something to change its value. For example, a lot of people try to do something like:

int number; 
from s in strings let b = Int32.TryParse(s, out number) blah blah blah

That is just asking for a world of pain because TryParse mutates the value of a variable that is outside the query.

In that specific case you'd be better off doing

int? MyParse(string s) 
{ 
    int result;
    return Int32.TryParse(s, out result) ? (int?)result : (int?)null;
}
...
from s in strings let number = MyParse(s) where number != null blah blah blah...
Eric Lippert
+1  A: 

Understanding the semantics of closures.

While this isn't a problem limited to just LINQ queries, closed over variables do tend to come up more frequently in LINQ because it's one of the most common places where lambda expressions are used.

While closures are very useful they can also be confusing and result in subtly incorrect code. The fact that closures are "live" (meaning that changes to variables outside of the captured expression are visible to the expression) is also unexpected to some developers.

Here's an example of where closures create problems for LINQ queries. Here, the use of closures and deferred execution combine to create incorrect results:

// set the customer ID and define the first query
int customerID = 12345;
var productsForFirstCustomer = from o in Orders
                               where o.CustomerID == customerID
                               select o.ProductID;

// change customer ID and compose another query...
customerID = 56789; // change the customer ID...
var productsForSecondCustomer = from o in Orders
                                where o.CustomerID == customerID
                                select o.ProductID;

if( productsForFirstCustomer.Intersect( productsForSecondCustomer ).Any() )
{
   // ... this code will always execute due to the error above ...
}

This code will enter the body of the if() { } statement whenever customer 56789 has any orders at all, because changing the customerID value affects the execution of both queries. Neither query has run by the time the variable is reassigned, and both LINQ statements capture the same customerID variable, so both in fact use the same ID when they finally execute.
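One way to avoid the problem is to give each query its own variable (forcing execution with ToList() before reassigning would also work). The sketch below contrasts the two versions. The Order class and sample data are hypothetical, since the original does not show them, and Intersect(...).Any() is used on the assumption that the intent was to test whether the two customers share a product:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical order type; the original does not show its definition.
class Order
{
    public int CustomerID;
    public int ProductID;
}

static class ClosureDemo
{
    // Sample data: the two customers bought different products.
    static readonly List<Order> Orders = new List<Order>
    {
        new Order { CustomerID = 12345, ProductID = 1 },
        new Order { CustomerID = 56789, ProductID = 2 },
    };

    // Reuses one variable: both deferred queries see its final value.
    public static bool SharedVariable()
    {
        int customerID = 12345;
        var first = from o in Orders where o.CustomerID == customerID select o.ProductID;
        customerID = 56789;
        var second = from o in Orders where o.CustomerID == customerID select o.ProductID;
        return first.Intersect(second).Any();
    }

    // One variable per query: each closure captures its own value.
    public static bool SeparateVariables()
    {
        int firstID = 12345;
        int secondID = 56789;
        var first = from o in Orders where o.CustomerID == firstID select o.ProductID;
        var second = from o in Orders where o.CustomerID == secondID select o.ProductID;
        return first.Intersect(second).Any();
    }

    static void Main()
    {
        Console.WriteLine(SharedVariable());    // True
        Console.WriteLine(SeparateVariables()); // False
    }
}
```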

LBushkin