
I am starting to explore LINQ to Objects but want to know how application design can best accommodate LINQ to Objects.

As an example: an application displays female employees only, but the employee table in the database contains data for both male and female employees.

By default, the application builds a SQL statement to retrieve only the female employees from the database and places them into an object (List) which is then passed around the application for display etc.

To get the greatest leverage from LINQ to objects, I assume you would first have to obtain all the data for employees and then apply LINQ to filter the object to only display the female employees. The benefit is that you can continue to query the employees without touching the database again.

This is a basic example, but the point I am trying to make is: where is the line between that benefit and efficient execution time?

What if the employee table has 100,000 records?

I can see the benefit of LINQ .... but in my head, I am convinced that you first need to put ALL the data into an object from which you can then use LINQ to query.

Getting ALL the data may be too costly to make it worthwhile when I can just query the database again each time I need a different subset of data.

Thoughts?

+2  A: 

If you only need a subset of the data to start with, only fetch that subset. I don't think it makes sense to fetch all the employee data in your example.

However, that doesn't mean that LINQ to Objects is rarely useful - often when you've already got some data (which may very well not be from a database - I find it incredibly useful with reflection, for example) you want to slice and dice it several ways. LINQ to Objects is a very powerful tool for that.
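As a rough illustration of the reflection case mentioned above, here is a minimal sketch that queries the methods of a type with LINQ to Objects. The class and method names (`ReflectionQueries`, `ParameterlessMethodNames`) are made up for this example; only the standard `System.Reflection` and `System.Linq` calls are real.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper class, named only for this illustration.
public static class ReflectionQueries
{
    // Returns the names of the public instance methods declared on the given
    // type that take no parameters, excluding property accessors, with
    // duplicates removed and the names sorted ordinally.
    public static List<string> ParameterlessMethodNames(Type type)
    {
        return type.GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
                   .Where(m => !m.IsSpecialName)              // skip get_/set_ accessors
                   .Where(m => m.GetParameters().Length == 0) // parameterless overloads only
                   .Select(m => m.Name)
                   .Distinct()
                   .OrderBy(name => name, StringComparer.Ordinal)
                   .ToList();
    }
}
```

Calling `ReflectionQueries.ParameterlessMethodNames(typeof(string))` gives a list you can then slice further with more LINQ operators, with no database involved at any point.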

Personally, I find LINQ to Objects the most compelling LINQ provider. It's easy to predict what it will do (because there's no translation involved) and it's useful in every layer of the application. I miss it dreadfully when I'm writing Java - pretty much any time I do anything interesting with a collection, LINQ to Objects makes it easier.

Jon Skeet
+1  A: 

LINQ is not only about databases.

In short, it gives you querying capabilities (with or without a DB) over a structure (which could be rows, XML, an in-memory list of objects, etc.), so that you don't have to write code to do things manually, and the result is more readable.

Imagine having to compare two lists and find the elements they have in common using C# code. Doing this in SQL is easy to understand, but doing the same thing in C# requires a little more code (and it will not be readable unless you work at making it so).
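For the two-lists case, a minimal sketch using the standard `Intersect` operator might look like this. The wrapper class and method names (`ListComparison`, `CommonElements`) are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, named only for this illustration.
public static class ListComparison
{
    // Returns the distinct elements that appear in both sequences,
    // in the order in which they appear in the first sequence.
    public static List<T> CommonElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        return first.Intersect(second).ToList();
    }
}
```

For instance, `ListComparison.CommonElements(new[] { 1, 2, 3, 4 }, new[] { 3, 4, 5 })` yields 3 and 4. The equivalent hand-written version needs nested loops or an explicit `HashSet<T>`.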

LINQ gives syntactic sugar which makes it look like you are writing SQL to query/sort/aggregate things. SQL is readable to most developers.

EDIT: Assume you have the subset of data that you wish to show to the users. Now, you want some kind of filter/aggregation/sorting operation so that you don't have to use the DB to do all that, how will you do it?

What if there were something that treated my collection as a queryable, sortable, aggregatable structure (similar to a table in SQL)?

shahkalpesh
+1  A: 

When you're dealing with data from a database, it starts to get fuzzy whether you should rely on LINQ-to-Objects or doing queries in the database. I think in general, if you have a database, it's always best to do as much filtering and sorting in the DB as possible, and use LINQ to Objects sparingly, especially on large collections of data.

However, since DAL code can get kind of clunky, I find that sometimes it's just easier to run a FindAll query in the DAL and then use LINQ to Objects to sort and filter (but only if your collection is small).

LINQ-to-Objects is useful for doing database-like sorting and filtering on in-memory collections. This might be a collection that you pulled from the database that you need to filter down a little more, but it could also be just any collection in code.
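A small sketch of that "filter it down a little more" case, assuming the list has already been fetched by the DAL. The `Employee` type and its properties are hypothetical stand-ins, as is the `EmployeeViews` helper.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for whatever the DAL returns.
public class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
}

// Hypothetical helper class, named only for this illustration.
public static class EmployeeViews
{
    // Narrows an already-fetched (small) collection to one department
    // and sorts it by name, without going back to the database.
    public static List<Employee> InDepartmentByName(IEnumerable<Employee> employees, string department)
    {
        return employees.Where(e => e.Department == department)
                        .OrderBy(e => e.Name)
                        .ToList();
    }
}
```

The database still does the heavy filtering (for example, female employees only); LINQ to Objects only reshapes the small result that is already in memory.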

Andy White
+1  A: 

You can use LINQ for far more than just querying databases. Just as LINQ to database entities, XML, etc. benefit from deferred processing, so do queries against arrays, collections, object graphs and almost any other in-memory structure you can think of. By deferred processing I mean that you can define the query (assigned to a variable) and it doesn't actually execute until you start enumerating the results. Also, the predicates in LINQ to Objects queries are chainable; think filtering pipelines.

For example, imagine you want to make a string extension method that strips out punctuation and whitespace. You could create a query against the string like this:

public static string RemovePunctuation(this string @string)
{
    if (string.IsNullOrEmpty(@string))
        return @string;

    var chars = (
        from ch in @string
        where !Char.IsPunctuation(ch) && !Char.IsWhiteSpace(ch)
        select ch
        ).ToArray(); // <- calling ToArray enumerates the results

    // Build the result straight from the chars; converting each char to a
    // byte would throw an OverflowException for any character above 255.
    return new string(chars);
}

There are certainly other ways to do the same thing, but this is an interesting use of LINQ against a string, and it performs quite well.
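To make the deferred processing point concrete, here is a minimal sketch. The class and method names (`DeferredExecutionDemo`, `Run`) are invented for the example. The query is defined before the list is changed, yet the result reflects the change, because nothing runs until the results are enumerated.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, named only for this illustration.
public static class DeferredExecutionDemo
{
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not execute it.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        // The source is modified after the query was defined...
        numbers.Add(4);

        // ...and the query only runs here, so it sees the new element.
        return evens.ToArray(); // 2, 4
    }
}
```

If you need a snapshot instead, call `ToArray()` or `ToList()` at the point where the query is defined.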

Kris
+1  A: 

Philosophy of use? Here's a general answer you can apply elsewhere.

Read a few examples of where a tool is useful. If you then find yourself lacking a reason to use that tool, because similar examples never arise for you, then forget about it. It may just not be applicable to your domain.

If all the data you are manipulating is in the RDBMS, then maybe you don't need Linq to Objects at all.

On the other hand... it may be that you're thinking of it as a way of doing some extra manipulation on data from the database, and so missing opportunities to tighten up the expressiveness of your code that has nothing to do with the database.

Example: you're reading a file consisting of lines of plain text.

var lines = File.ReadAllLines(fileName);

As it happens, lines now contains an array of strings, but arrays implement IEnumerable<T>, so we can use Linq methods on them. Suppose you want to remove the lines with nothing in them:

var nonBlankLines = lines.Where(line => line.Trim() != string.Empty);

And suppose you wanted those strings in quotes (naive implementation - need to escape existing quotes!):

var quoted = lines.Where(line => line.Trim() != string.Empty)
                  .Select(line => "\"" + line + "\"");

(I like to line up successive operations with the dot-method aligned under each other.)

And unless I'm going to do something else with the lines, I'd do this:

var quoted = File.ReadAllLines(fileName)
                 .Where(line => line.Trim() != string.Empty)
                 .Select(line => "\"" + line + "\"");

And then suppose I want this all turned into a single string separated by commas. There's a method in string called Join that can do that, if I turn it all into an array first:

var quoted = string.Join(", ",
                         File.ReadAllLines(fileName)
                             .Where(line => line.Trim() != string.Empty)
                             .Select(line => "\"" + line + "\"")
                             .ToArray());

Or we can use a Linqy way of doing it:

var quoted = File.ReadAllLines(fileName)
                 .Where(line => line.Trim() != string.Empty)
                 .Select(line => "\"" + line + "\"")
                 .Aggregate((a, b) => a + ", " + b);

Also it's no big deal to fill in a few blanks where you find there's no existing operator for what you need (although sometimes there turns out to already be one). One big one that is missing is an opposite to Aggregate, which I've taken to calling Util.Generate:

public static IEnumerable<T> Generate<T>(T item, Func<T, T> generator) where T : class
{
    // Follow the chain until the generator returns null.
    for (; item != null; item = generator(item))
        yield return item;
}

This comes in very handy when you've got a linked list, of the kind that crops up occasionally in object models. An example is Exception.InnerException, which allows exceptions to form a linked list, with the innermost one at the end. Suppose we want to display the message from the innermost exception of x only:

MessageBox.Show(Util.Generate(x, i => i.InnerException).Last().Message);

The Generate helper method converts the linked list into an IEnumerable, allowing other Linq methods to work on it. It just needs to be given a lambda to tell it how to get to the next item from the current one.

Maybe this will get you started, or maybe you need more examples, or maybe you literally never manipulate any data that isn't sourced from the RDBMS.

Daniel Earwicker
+1  A: 

LinqToObjects is not about working with 100,000 elements in a non-indexed collection efficiently. Neither is a foreach loop.

Philosophically, LinqToObjects and foreach work in the same space. Examine the foreach loops in your code and see if they are more expressively written as LinqToObjects queries.
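As a sketch of that comparison, here is the same operation written both ways. The class and method names are made up for the example; both methods are intended to return the same result.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, named only for this illustration.
public static class LoopVersusQuery
{
    // The foreach version: filter, transform, collect by hand.
    public static List<string> LongWordsWithLoop(IEnumerable<string> words, int minLength)
    {
        var result = new List<string>();
        foreach (var word in words)
        {
            if (word.Length >= minLength)
                result.Add(word.ToUpperInvariant());
        }
        return result;
    }

    // The LINQ to Objects version: the same steps as a pipeline.
    public static List<string> LongWordsWithQuery(IEnumerable<string> words, int minLength)
    {
        return words.Where(word => word.Length >= minLength)
                    .Select(word => word.ToUpperInvariant())
                    .ToList();
    }
}
```

Neither version is fast on 100,000 unindexed elements; the difference is in how directly the code states what it is doing.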

David B