views:

1370

answers:

3

Any advice on when DataTable.Select should be used versus LINQ Select when dealing with an in-memory DataTable?

I find LINQ syntax easier and more powerful, but I'm not sure if there are performance or other issues which make a DataTable select preferable.

(I'm using a third party API that provides a DataTable that has been pre-populated from the database. I need to filter that further in-memory.)

A: 

Hmm, are you comparing datasets with LINQ to SQL? If you are, then I'd say just ditch datasets and go with L2S. DataTable.Select assumes you've already populated the datatable with data. This can lead to bad designs where you load more data than you need, just to filter in the client. Get SQL Server to do your querying for you, and work on the resultset it gives you. L2S will read from the database only when you iterate the collection, so it's much easier to formulate your queries before hitting the database.

LINQ to SQL introduces some debugging overhead because it can be awkward to get the dynamically-generated SQL out of it (whereas in datasets you are supplying the SQL in the first place), but in almost all other situations it's far more elegant. The deferred loading functionality is especially useful.

If you're not working with a database, then I'd still prefer LINQ (specifically known as LINQ to Objects in this scenario) over datasets. The syntax is just much easier, and because there are no magic strings (i.e. SQL statements) then it's easier to test and you get compile-time warnings for typos etc.

Neil Barnwell
Thanks for your answer. I should have made clear that I don't have interaction with the database. I'll update the question.
Alex Angas
Ahh fair enough. I've updated my answer on that basis.
Neil Barnwell
+2  A: 

Without even mentioning LINQ, I would not use DataTable.Select anywhere unless I absolutely had to, since in most cases it means performing in the client something that should probably be performed in the database.

Update: my answer here is probably a bit overstated. There are sometimes legitimate reasons for using a DataTable as a (hopefully) small in-memory database that minimizes client-to-database round trips.

MusiGenesis
Absolutely - although this is an 'absolutely have to' case. Working with a third party API.
Alex Angas
+2  A: 

Based upon personal experience, I try to avoid the Datatable.Select. I find it to be slow and has some odd bugs.

One (confirmed and documented by Microsoft) bug I ran into was that DataTable.Select doesn't always evaluate AND conditions correctly when there are parenthesis in the statement.

For example, (Col1 > 1) AND (Col < 10) can fail to return correct answers, whereas Col1 > 1 AND Col < 10 will work correctly.

This bug doesn't show up on every computer. In my case the check I was using ran fine on my development platform and every client computer except one. After I discovered this bug I began shifting to using LINQ for selects and noticed a significant increase in the speed of the operations.

Side note: Without going into long explanations, my company doesn't use a database to store data. All of our operations with DataTables involve in memory tables loaded from flat-files. So I am not talking about LINQ 2 SQL, but LINQ to Dataset.

RB Davidson