+9  Q: 

Addicted to LINQ

OK, the more I use LINQ, the more I like it! I recently found myself working in some legacy code at work. It is your classic DataSet- and DataTable-heavy application. While adding a bit of functionality, I found myself really wanting to just query the rows of a DataTable for the results I was looking for.

Let me repeat that... instead of looping and adding to a temp collection, I just wanted to ask the Rows collection for what I needed. No looping, no temp variables, just give me what I want.

    var customerOrderIds = table.Rows.Cast<DataRow>()
       .Where(x => (string)x["CUSTOMER_ID"] == customerId)
       .Select(x => (string)x["CUSTOMER_ORDER_ID"])
       .Distinct();

My question is whether or not this is a good thing, or am I getting carried away with LINQ? It does seem to me that this declarative style of pulling a subset of data out of a collection is a good thing and more readable in the end. But then again, maybe I'm just smitten :)

+5  A: 

Seems good to me - although I'd try to use a strongly typed data set which makes the LINQ queries look even more pleasant.

But yes, LINQ is a very good thing - and LINQ to Objects (and the surrounding technologies for XML and DataSets) is fabulously predictable compared to the out-of-process LINQ providers. (It's less sexy than LINQ to SQL, but more widely applicable IMO.)

Jon Skeet
A: 

Personally, since the data table doesn't have the ability to do a select distinct on its own, I'd say that it isn't all that bad.

I would potentially ask though if there was any way to eventually get to using objects rather than data tables, as I think it would be easier for future developers to understand.

Mitchel Sellers
A: 

You're not getting carried away at all. There are actual works published on LINQ to DataSets. Having such clear, declarative object queries makes for much easier code maintainability. But you have to remember that by the time you're filtering the data, all of it has already been pulled back. You may want to consider adding the filtering to the SQL for the DataSet query.

Chad Moran
+9  A: 

One other observation; if you aren't using typed datasets, you might also want to know about the Field<> extension method:

    var customerOrderIds = table.Rows.Cast<DataRow>()
       .Where(x => x.Field<string>("CUSTOMER_ID") == customerId)
       .Select(x => x.Field<string>("CUSTOMER_ORDER_ID"))
       .Distinct();

Or using the query syntax:

    var customerOrderIds = (
        from row in table.Rows.Cast<DataRow>()
        where row.Field<string>("CUSTOMER_ID") == customerId
        select row.Field<string>("CUSTOMER_ORDER_ID")
    ).Distinct();

I'm not saying it is better or worse - just another viable option.

(Actually, I don't use DataTable very much, so YMMV)

Marc Gravell
I didn't know about Field<>... sweet - you just added to my addiction I think
Rob
(sfx: frantically checks book to see whether I knew about that at one point...) Phew... it's amazing what one forgets in the course of a year :)
Jon Skeet
A: 

LINQ is simply writing the "looping/temp variable" code for you. LINQ helps you write code faster, and the result is more readable.

Your code is good.
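To make the "looping/temp variable" point concrete, here is a rough sketch of the hand-written loop next to the query from the question. The class and method names (`OrderIdLookup`, `WithLoop`, `WithLinq`) are made up for illustration, and the sketch assumes both columns hold non-null strings.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class OrderIdLookup
{
    // Hand-written version: an explicit loop plus two temporary collections.
    public static List<string> WithLoop(DataTable table, string customerId)
    {
        var seen = new HashSet<string>();   // ids already collected
        var result = new List<string>();    // distinct ids found so far
        foreach (DataRow row in table.Rows)
        {
            if ((string)row["CUSTOMER_ID"] != customerId) continue;
            var orderId = (string)row["CUSTOMER_ORDER_ID"];
            if (seen.Add(orderId)) result.Add(orderId); // Add returns false for duplicates
        }
        return result;
    }

    // Declarative version: the same filtering, projection and de-duplication.
    public static List<string> WithLinq(DataTable table, string customerId)
    {
        return table.Rows.Cast<DataRow>()
            .Where(x => (string)x["CUSTOMER_ID"] == customerId)
            .Select(x => (string)x["CUSTOMER_ORDER_ID"])
            .Distinct()
            .ToList();
    }
}
```

Both methods should return the same set of order ids; the LINQ version just hides the bookkeeping.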

Timothy Khouri
+1  A: 

I too am smitten with LINQ. I have convinced management to let me build the next version of our main product using as much LINQ as I see fit. Good Times...

Sean
Good times and good luck!
Rob
Should this "Answer" actually be a comment?
kzh
+3  A: 

The query looks fine.

I'd like to point out two small things.

No looping

System.Linq.Enumerable methods operate against the IEnumerable(T) contract, which almost always means looping - O(N) solutions. Two implications of this:

  • Prefer Any() over Count() > 0. Any() is O(1). Count() is O(N).
  • Join... all joins are nested loop O(M*N).
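A small sketch of the Any() versus Count() point, for a lazily produced sequence. The `Numbers` iterator and the `Pulled` counter are hypothetical helpers added only to show how many items each call pulls from the source.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Counts how many items the iterator below has handed out.
    public static int Pulled;

    // Hypothetical lazy source: yields 1..n, bumping Pulled for each item.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 1; i <= n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Any() stops as soon as it has seen one item.
    public static int PulledByAny(int n)
    {
        Pulled = 0;
        Numbers(n).Any();
        return Pulled;
    }

    // Count() has to walk an iterator-based sequence to the end.
    public static int PulledByCount(int n)
    {
        Pulled = 0;
        bool nonEmpty = Numbers(n).Count() > 0;
        return Pulled;
    }
}
```

With n = 1000, `PulledByAny` reports 1 item pulled and `PulledByCount` reports 1000.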

.Cast

.Cast works great for DataTable.Rows (all those objects -are- rows, so the cast always succeeds). For heterogeneous collections, be aware of .OfType() - which filters out any items that cannot be cast.
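As an illustration of the difference, here is a minimal sketch using a mixed ArrayList; the class and member names are made up for the example.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class CastVersusOfType
{
    // A heterogeneous, non-generic collection: two strings, an int and a double.
    public static readonly ArrayList Mixed = new ArrayList { "a", 1, "b", 2.5 };

    // OfType<string>() silently skips anything that is not a string.
    public static string[] StringsOnly()
    {
        return Mixed.OfType<string>().ToArray();
    }

    // Cast<string>() throws once enumeration reaches the first non-string item.
    public static bool CastThrows()
    {
        try
        {
            Mixed.Cast<string>().ToArray();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

`StringsOnly()` yields "a" and "b", while `CastThrows()` returns true because the int cannot be cast to string.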

Lastly, be aware that queries are not executed until they are enumerated! You can force enumeration by foreach, ToList, ToArray, First, Single, and many more.
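A quick sketch of what deferred execution means in practice (the names here are illustrative only): the query is re-evaluated each time it is enumerated, whereas ToList takes a snapshot.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecution
{
    // Returns { items in the snapshot, items seen by the live query }.
    public static int[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Nothing is filtered yet; this only describes the query.
        IEnumerable<int> evens = source.Where(n => n % 2 == 0);

        // ToList forces enumeration now, capturing just the value 2.
        List<int> snapshot = evens.ToList();

        // Changing the source afterwards affects the query, not the snapshot.
        source.Add(4);

        // Count() enumerates the query again and now sees 2 and 4.
        return new[] { snapshot.Count, evens.Count() };
    }
}
```

`Run()` returns 1 for the snapshot and 2 for the live query.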

David B
You are absolutely correct about the looping... I guess I'm most interested in the fact that I'm not the one doing the looping. If there aren't performance implications, I find the readability and thus maintainability to be much higher. Excellent tidbit about using Any()!
Rob
Actually, Count() will check the source for ICollection<T> and use the .Count - however, this won't work for anything with an iterator block... so source.Where(...).Count() might be expensive, but source.Count() might be cheap. But best practice is indeed Any()
Marc Gravell
Thanks Marc. Good to learn.
David B
A: 

A join (using the join keyword, but not the from keyword) uses a Dictionary for the matches and is thus O(M+N).

So is a group by, but not the following:

    from x in Xs
    from y in Ys.Where(o => o == x)
    select new { x, y }

which is O(M*N).
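For comparison, here is a sketch of the two query shapes side by side on plain int arrays. The class and method names are hypothetical, and the tuple syntax assumes C# 7 or later. Both return the same pairs; the difference is that the join builds a hash-based lookup over the inner sequence once, while the second form re-scans `ys` for every `x`.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class JoinShapes
{
    // join keyword: inner sequence is indexed once, roughly O(M+N).
    public static List<(int x, int y)> HashJoin(int[] xs, int[] ys)
    {
        return (from x in xs
                join y in ys on x equals y
                select (x, y)).ToList();
    }

    // Two from clauses: ys is filtered again for each x, O(M*N).
    public static List<(int x, int y)> NestedLoop(int[] xs, int[] ys)
    {
        return (from x in xs
                from y in ys.Where(o => o == x)
                select (x, y)).ToList();
    }
}
```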