For programmers who do not come from a functional programming background, are there any mistakes to avoid?
The biggest mistake people tend to make is misunderstanding the laziness and evaluation rules of a LINQ query.
Queries are lazy; they are not executed until you iterate over them:
// This does nothing! No query executed!
var matches = results.Where(i => i.Foo == 42);
// Iterating them will actually do the query.
foreach (var match in matches) { ... }
Also, results are not cached. They are computed each time you iterate over them:
var matches = results.Where(i => i.ExpensiveOperation() == true);
// This will perform ExpensiveOperation on each element.
foreach (var match in matches) { ... }
// This will perform ExpensiveOperation on each element again!
foreach (var match in matches) { ... }
Bottom line: know when your queries get executed.
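If you do need to iterate the results more than once, one common way out is to materialize the query with ToList() (or ToArray()) and iterate the stored results instead. Here is a minimal sketch of that idea; IsMatch is a hypothetical stand-in for ExpensiveOperation that just counts how often it is called, and the sample data is made up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int calls = 0;

// Hypothetical stand-in for ExpensiveOperation: counts its invocations.
bool IsMatch(int i)
{
    calls++;
    return i % 2 == 0;
}

var results = new List<int> { 1, 2, 3, 4 };

// Building the query does not call IsMatch at all.
var matches = results.Where(IsMatch);
int callsAfterBuild = calls;      // 0

// ToList() runs the query once and stores the matching elements.
List<int> cached = matches.ToList();
int callsAfterToList = calls;     // 4, one call per source element

// Iterating the stored list does not re-run the predicate.
foreach (var match in cached) { }
foreach (var match in cached) { }
int callsAfterLoops = calls;      // still 4
```

The trade-off is that the list is a snapshot: later changes to results will not show up in cached.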
IMO when you face LINQ, you must know these topics (they're big sources of errors):
For programmers who do not come from a functional programming background, are there any mistakes to avoid?
Good question. As Judah points out, the biggest one is that a query expression constructs a query; it does not execute the query that it constructs.
An immediate consequence of this fact is that executing the same query twice can return different results.
An immediate consequence of that fact is that executing a query a second time does not re-use the results of the previous execution, because the new results might be different.
Another important fact is that queries are best at asking questions, not changing state. Try to avoid any query that directly or indirectly causes something to change its value. For example, a lot of people try to do something like:
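To make that concrete, here is a small sketch (the list and the query are made up for illustration) where the same query object is executed twice and gives two different answers because the source changed in between:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3 };

// The query describes work to be done; it is not a snapshot of results.
var evens = from n in numbers
            where n % 2 == 0
            select n;

int firstCount = evens.Count();   // executes the query: one match (2)

numbers.Add(4);                   // the source changes between executions

int secondCount = evens.Count();  // executes it again: two matches (2 and 4)
```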
int number;
from s in strings let b = Int32.TryParse(s, out number) blah blah blah
That is just asking for a world of pain because TryParse mutates the value of a variable that is outside the query.
In that specific case you'd be better off doing:
int? MyParse(string s)
{
    int result;
    return Int32.TryParse(s, out result) ? (int?)result : (int?)null;
}
...
from s in strings let number = MyParse(s) where number != null blah blah blah...
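For what it's worth, here is how that might look as a complete snippet. This is only a sketch: the input array is made up, and MyParse is written as a local function purely so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same helper as above. The out variable lives inside the helper,
// so the query itself never mutates anything outside of it.
int? MyParse(string s)
{
    int result;
    return Int32.TryParse(s, out result) ? (int?)result : (int?)null;
}

var strings = new[] { "10", "oops", "32" };

// Strings that fail to parse become null and are filtered out.
var parsed = (from s in strings
              let number = MyParse(s)
              where number != null
              select number.Value).ToList();   // contains 10 and 32
```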
Understanding the semantics of closures.
While this isn't a problem limited to LINQ queries, closed-over variables do tend to come up more frequently in LINQ, because it's one of the most common places where lambda expressions are used.
While closures are very useful, they can also be confusing and result in subtly incorrect code. The fact that closures are "live" (meaning that changes to a captured variable made outside of the lambda expression are visible inside it) is also unexpected to some developers.
Here's an example of where closures create problems for LINQ queries. Here, the use of closures and deferred execution combine to create incorrect results:
// set the customer ID and define the first query
int customerID = 12345;
var productsForFirstCustomer = from o in Orders
                               where o.CustomerID == customerID
                               select o.ProductID;
// change customer ID and compose another query...
customerID = 56789; // change the customer ID...
var productsForSecondCustomer = from o in Orders
                                where o.CustomerID == customerID
                                select o.ProductID;
// Any() takes a predicate, not a sequence, so test for a shared product with Intersect()
if( productsForFirstCustomer.Intersect( productsForSecondCustomer ).Any() )
{
    // ... this code executes whenever customer 56789 has any orders, due to the error above ...
}
This code will enter the body of the if() { } statement whenever the second customer has any orders, because changing the customerID value affects the execution of both queries. They in fact both use the same ID (56789), since the same customerID variable is captured in both LINQ statements and neither query runs until the if condition is evaluated.
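Two ways around this, sketched under some assumptions: the orders array below is made-up in-memory data standing in for Orders, and Intersect(...).Any() is just one way to ask whether the two customers share a product. Either give each query its own variable that is never reassigned, or materialize the first query before changing the variable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for Orders.
var orders = new[]
{
    (CustomerID: 12345, ProductID: 1),
    (CustomerID: 56789, ProductID: 2),
};

// Option 1: each query captures its own variable, which is never reassigned.
int firstCustomerID = 12345;
var productsForFirstCustomer = from o in orders
                               where o.CustomerID == firstCustomerID
                               select o.ProductID;

int secondCustomerID = 56789;
var productsForSecondCustomer = from o in orders
                                where o.CustomerID == secondCustomerID
                                select o.ProductID;

bool shareProduct =
    productsForFirstCustomer.Intersect(productsForSecondCustomer).Any();   // false

// Option 2: reuse one variable, but run the first query (ToList)
// before the variable changes, so its results are already fixed.
int customerID = 12345;
var firstSnapshot = (from o in orders
                     where o.CustomerID == customerID
                     select o.ProductID).ToList();

customerID = 56789;
var secondSnapshot = (from o in orders
                      where o.CustomerID == customerID
                      select o.ProductID).ToList();

bool snapshotsShareProduct = firstSnapshot.Intersect(secondSnapshot).Any(); // false
```

Option 1 keeps the queries lazy; Option 2 trades laziness for results that no longer depend on the captured variable.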