What are the fundamental misunderstandings people have when they first start using LINQ?
For instance, do they think it is one thing when it is really something else?
And, are there some best practices to employ to avoid these mistakes?
Possibly one of the misconceptions people have is that the way a LINQ query is written, especially with LINQ to SQL, has no impact on performance. If you intend to write high-performance code, you should always know what goes on in the background; otherwise you might end up with interesting timeouts, OutOfMemoryExceptions, stack overflows and such... =)
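As a small illustration (the NorthwindDataContext and Customers table here are hypothetical placeholders for whatever your designer generates), the shape of the query decides how much work the database does versus how much data gets dragged into memory:

```csharp
using (var db = new NorthwindDataContext())
{
    // The filter is translated to SQL; only the matching rows cross the wire.
    var londoners = db.Customers
                      .Where(c => c.City == "London")
                      .ToList();

    // Calling ToList() first pulls the ENTIRE table into memory and then
    // filters it with LINQ to Objects: same results, very different cost.
    var alsoLondoners = db.Customers
                          .ToList()
                          .Where(c => c.City == "London")
                          .ToList();
}
```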
Here is one: LINQ to SQL queries involving strings cause SQL Server procedure cache bloat. People need to be aware of that.
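If I recall correctly, the root cause is that (at least in earlier versions) LINQ to SQL sizes string parameters to the length of the actual value, so SQL Server caches one plan per distinct length. A rough sketch of the effect (the context and table names are placeholders):

```csharp
// Two logically identical queries that differ only in the string value:
var a = db.Customers.Where(c => c.ContactName == "Bob").ToList();
var b = db.Customers.Where(c => c.ContactName == "Alice").ToList();

// The generated SQL can declare the parameter as NVARCHAR(3) for "Bob" and
// NVARCHAR(5) for "Alice", which defeats plan reuse and bloats the cache.
```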
The biggest mistake people make when using LINQ is the same one they make with any technology that sits on top of another technology they have no good grounding in.
If you can't understand proper/efficient DB querying, you will screw up with LINQ.
If you can't understand the basic fundamentals of ADO.NET and data access, you'll probably screw up.
People think that using LINQ will allow them to coast by, but it won't.
Failing to understand the differences between (or existence of!) the following (see the sketch after the list):
.First()
.FirstOrDefault()
.Single()
.SingleOrDefault()
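To make the distinctions concrete, here is a minimal LINQ to Objects sketch of how the four behave:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var numbers = new[] { 1, 2, 3, 3 };

        Console.WriteLine(numbers.First(n => n > 1));           // 2: the first match
        Console.WriteLine(numbers.FirstOrDefault(n => n > 9));  // 0: no match, so default(int)
        Console.WriteLine(numbers.Single(n => n == 2));         // 2: exactly one match required

        // Single throws if there is more than one match...
        // numbers.Single(n => n == 3);   // InvalidOperationException
        // ...and First/Single throw if there is no match at all.
        // numbers.First(n => n > 9);     // InvalidOperationException

        Console.WriteLine(numbers.SingleOrDefault(n => n > 9)); // 0: no match is fine, duplicates are not
    }
}
```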
Not understanding deferred execution.
I think misunderstanding the point of query execution is a common mistake (i.e. believing the query runs at the point where it is written rather than at the point the data is first accessed), along with the belief that just because it compiles it's going to run.
This is in reference to LINQ to SQL.
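For example (assuming a hypothetical designer-generated NorthwindDataContext), the following compiles cleanly but fails only when the query is enumerated, because the provider cannot translate the local method into SQL:

```csharp
using System;
using System.Linq;

class Example
{
    // A local helper the SQL provider knows nothing about.
    static bool IsInteresting(string name)
    {
        return name.StartsWith("A");
    }

    static void PrintInterestingCustomers()
    {
        using (var db = new NorthwindDataContext()) // hypothetical designer-generated context
        {
            // Compiles fine; nothing has executed yet.
            var query = db.Customers.Where(c => IsInteresting(c.ContactName));

            // Typically throws NotSupportedException here, when the data is first
            // accessed, because IsInteresting has no translation to SQL.
            foreach (var customer in query)
                Console.WriteLine(customer.ContactName);
        }
    }
}
```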
A fantastic tool for LINQ is LINQPad by Joe Albahari; it allowed me to learn LINQ so much more quickly. If you don't have it, get it! And I'm not even on commission ;)
LINQ as a language is pretty straightforward and not so unexpected, especially if you're familiar with functional programming.
The concept of deferred execution is probably the biggest gotcha, and one of the best features. When a LINQ query returns an IQueryable, it's important to remember that you have NOT yet executed the code you just wrote. The query isn't executed until you call one of the methods that produces a concrete result.
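A small, self-contained illustration of deferred execution (LINQ to Objects here, but the same rule applies to IQueryable providers):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var words = new List<string> { "apple", "banana" };

        // Nothing runs here; this only builds the query.
        var aWords = words.Where(w => w.StartsWith("a"));

        words.Add("apricot");

        // The query executes now, against the current contents of the list.
        Console.WriteLine(string.Join(", ", aWords)); // apple, apricot
    }
}
```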
Also, in terms of the LINQ to SQL provider, the biggest gotcha I've found is the performance cost. It turns out there is a significant CPU cost to translating the expression tree into SQL, and it is incurred every time the LINQ query is run, unless you pre-compile your highly trafficked queries.
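Pre-compiling is done with CompiledQuery.Compile; a minimal sketch, assuming a hypothetical designer-generated NorthwindDataContext and Customer type:

```csharp
using System;
using System.Data.Linq;
using System.Linq;

static class Queries
{
    // The expression tree is translated to SQL once, not on every call.
    public static readonly Func<NorthwindDataContext, string, IQueryable<Customer>> CustomersByCity =
        CompiledQuery.Compile((NorthwindDataContext db, string city) =>
            db.Customers.Where(c => c.City == city));
}

// Usage:
// using (var db = new NorthwindDataContext())
// {
//     var londoners = Queries.CustomersByCity(db, "London").ToList();
// }
```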
Some things which come to mind are:
One basic one that I see in LINQ to SQL is not understanding DataContext. It is a Unit of Work object and should be re-created for each unit of work.
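A minimal sketch of the per-unit-of-work pattern, again assuming a hypothetical designer-generated NorthwindDataContext:

```csharp
using System.Linq;

class CustomerService
{
    // One short-lived DataContext per logical operation; don't cache a single
    // instance for the lifetime of the application.
    public void RenameCustomer(string customerId, string newName)
    {
        using (var db = new NorthwindDataContext()) // hypothetical designer-generated context
        {
            var customer = db.Customers.Single(c => c.CustomerID == customerId);
            customer.ContactName = newName;
            db.SubmitChanges(); // the unit of work ends here
        }
    }
}
```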
I totally agree with Adam Robinson; in fact, the BIG mistake is that people stop at the beautiful syntax and don't go deeper into the technical facts, in terms of performance impact or architectural implications.
Sometimes people think of it as one thing when it's really another. On that point, it's important to note that LINQ is a "technology" that can be implemented in many ways, and each implementation can affect performance and design (for example) differently; the basic syntax remains the same, but what happens underneath can change.
Given the many great and growing implementations, there is no single complete list of best practices; the best practices could begin from:
Using LINQ to SQL on tables with no primary keys (and not defining one in the designer).
Especially if what they are doing is an update: nothing gets updated and you get no error.
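If that sounds surprising, it looks roughly like this in code (the LegacyLogs table and the context are made-up names standing in for a table mapped without a primary key):

```csharp
using (var db = new NorthwindDataContext())
{
    // LegacyLogs stands in for a table mapped without a primary key.
    var row = db.LegacyLogs.First();
    row.Message = "updated";

    // Because the entity has no primary key it isn't change-tracked, so (as
    // noted above) this persists nothing: no exception, no UPDATE statement.
    db.SubmitChanges();
}
```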
A lot of people think that LINQ is "magical SQL" they can use in code. It looks like SQL, but it's quite different. Understanding that difference, and what it's really doing, will prevent a lot of frustration.
Speaking for myself, knowing when a sequence will be buffered or streamed is important.
Filling a buffer with large amounts of data will consume lots of memory. If possible, operations like reversing, counting, ordering, etc. should be done once the data has been reduced. In joins the left sequence is streamed and the right is buffered. When there's a significant difference in size, put the smaller one on the right.
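For LINQ to Objects that means putting the smaller sequence in the inner (second) position of Join, since the inner sequence is loaded into a lookup while the outer is streamed. A small, runnable sketch:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var orders = Enumerable.Range(1, 1000000)
                               .Select(i => new { OrderId = i, CustomerId = i % 100 });
        var customers = Enumerable.Range(0, 100)
                                  .Select(i => new { CustomerId = i, Name = "Customer " + i });

        // The inner (right) sequence, customers, is the small one and gets buffered
        // into a lookup; the huge orders sequence is streamed one item at a time.
        var query = orders.Join(customers,
                                o => o.CustomerId,
                                c => c.CustomerId,
                                (o, c) => new { o.OrderId, c.Name });

        Console.WriteLine(query.Count());
    }
}
```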