views:

149

answers:

4

I'm writing a bit of LINQ to SQL that needs to perform a search for client information in our database, filtering the results and paging them. The filtering needs to be dynamic so I have broken the setup of the query into four stages:

  1. Find visible clients (basically apply a coarse grained filter)
  2. Filter clients by search criteria
  3. Sort and page the clients
  4. Retrieve additional client data

The code goes along the lines of:

// Step 1
var visibleClientQuery = from x in xs
                         from y in ys
                         from z in yas
                         where
                             ...
                         select new
                         {
                             id = z.Client.Id,
                             a = z.Client.a,
                             b = z.Client.e.f.g.b
                         };

// Step 2
if (filterByA)
{
    visibleClientQuery = visibleClientQuery.Where(client => client.a > 10);
}
else
{
    visibleClientQuery = visibleClientQuery.Where(client => client.b.StartsWith("ABC"));
}

// Step 3
visibleClientQuery = visibleClientQuery.Distinct();

if (filterByA)
{
    visibleClientQuery = visibleClientQuery.OrderBy(client => client.a);
}
else
{
    visibleClientQuery = visibleClientQuery.OrderBy(client => client.b);
}

visibleClientQuery = visibleClientQuery.Skip(50).Take(30);

// Step 4
var fullQuery = from filterClientDetail in filteredClientQuery
                 join client in Clients on filterClientDetail.Id equals client.Id
                 ...;

LINQ to SQL does a great job of combining these elements together to produce an efficient query. But one thing I want to do is reduce the number of joins created in the first, coarse query. If I'm filtering and sorting by A, there is no need for the first query to populate b using the line:

b = z.Client.e.f.g.b

Not populating b would save all that extra joining. I tried replacing the line with:

b = filterByA ? string.Empty : z.Client.e.f.g.b

Whilst this functionally works, when filterByA is true, the excess joins are still present in the query... slowing it down. My workaround is to introduce another interim query that wraps the Step 1 query:

// Step 1
var visibleClientQuery = from x in xs
                         from y in ys
                         from z in yas
                         where
                             ...
                         select z.Client;

// Step 2
var filteredClientQuery = from client in visibleClientQuery
                          select new
                          {
                              id = client.Id,
                              a = client.a,
                              b = string.Empty
                          };

Then if we need to filter by B, this is replaced with:

filteredClientQuery = from client in visibleClientQuery
                      select new
                      {
                          id = client.Id,
                          a = 0,
                          b = client.e.f.g.b
                      };

So long as the replacement query returns an anonymous class with the same properties, this works, but seems like an unnecessary, heavyweight hack and doesn't allow easy mixing and matching of filters... what if we need to filter by A and B? The real query has several more possible filters.

Is there any way to programtically change how an individual property is populated within the anonymous class returned from a query... in much the same way that the where clause can be changed? Using filteredClientQuery.Select(...) I can swap out the entire select, but I can't see a way to work on an individual property.

Any help appreciated... even if it is to just confirm there is no solution!

A: 

If you can accept the lack of type safety, you can try Dynamic Linq, as described by Scott Guthrie, where strings are used to represent the different parts of the query.

An example (given in the download pointed to by Scott) is:

   var query =
        db.Customers.Where("City == @0 and Orders.Count >= @1", "London", 10).
        OrderBy("CompanyName").
        Select("New(CompanyName as Name, Phone)");

It is somehow a step backwards to use strings in a query, but in the special case of a dynamic query, it is easier to construct the various pieces (Where, Select) through strings, adding the needed filter or selected column as you analyze what the user requested.

Timores
I'll look at it but I agree it seems to be a step backwards... as well as the intellisense and type safety, one reason I wanted to use LINQ to SQL was to help minimise the attack surface for SQL injection attacks, something that string concatenation is notorious for.
Piers Lawson
You're safe on that side, see the first comments in the post.
Timores
That is good to know, but I'd still worry about some future developer making a change and simply concatenating strings... but code reviews should catch that ;-). However, I'd still like to find a programatic approach if possible!
Piers Lawson
You ruled out the call to Select in your description, but this is a - admittedly complex - way to go to dynamically generate the query, while keeping type safety.Another way is building expression trees, but this is even more difficult.
Timores
Yes, calling Select (AFAIK) only allows me to replace the entire select, so doesn't improve matters. What I want to do is change how the properties within an existing Select are populated. So I would set up a basic query with a Select that returns a type of the "correct" shape, then subsequent code would alter how those properties are populated (on a property by property basis). Possibly the initial query would be “fully populated” and subsequent code could clear some properties.
Piers Lawson
Alternatively, is there a way to set up the Select such that the population of a property chooses between two anonymous delegates at construction time? Rather than what I suspect is happening above, whereby the b = filterByA ? string.Empty : z.Client.e.f.g.b becomes a single delegate that is used within the expression tree as is? Can I define the delegate external to the query and still reference values within the query? If so I could supply different delegates.
Piers Lawson
Dynamic Linq sucks. You might as well go back to string concatenation and dc.ExecuteQuery.
David B
Thanks for downvoting a correct answer to a difficult question! Do you have anything constructive to add ?
Timores
Why downvote? This is an acceptable solution.
Richard Hein
This isn't a correct answer to a difficult question. All you did was link Scott Guthrie. The OP wants to use linq, presumably for compile time guarantees and you flush the whole promise by resorting to compiler-opaque strings.
David B
I'm downvoting too. There seem to be a number of people who respond with "Use DLINQ/DynamicQuery" every time someone asks a query about putting together a query in multiple steps. The two have **nothing** to do with each other. DLINQ only applies when you don't know *field names* ahead of time, which isn't the case here at all.
Aaronaught
@David B, I have to apologize about the last part of my previous comment, since I hadn't seen that you had indeed answered the question (although the OP had ruled out using the Select method, which is my preferred answer). I still think that dynamic Linq can be considered for, well, dynamic queries, since using classic Linq is hard in these cases. I did mention the disadvantages, though.
Timores
+1  A: 

You should write your query against the mapping class as much as possible. Surrender the control.

Edit: Do not let anonymous type declarations constrain your thinking. Feel free to declare a custom type to hold the custom result. After all, an anonymous type is just letting the compiler create/name the type for you.

public class SubClient
{
   public int id {get;set;}
   public int? a {get;set;}
   public string b {get;set;}
}


// Step 1 
IQueryable<Client> visibleClientQuery =
    from x in xs 
    from y in ys 
    from z in yas 
    where 
    ... 
    select z.Client;

IQueryable<SubClient> subClientQuery = null;

// Step 2 
if (filterByA) 
{ 
    subClientQuery = visibleClientQuery.Select(c => new SubClient()
    {
        id = c.id,
        a = c.a
    }).Where(x => x.a > 10); 
} 
else 
{ 
    subClientQuery = visibleClientQuery.Select(c => new SubClient()
    {
      id = c.id,
      b = c.e.f.g.b
    })
    .Where(x => x.b.StartsWith("ABC")); 
}

// Step 3 
subClientQuery = subClientQuery.Distinct(); 

if (filterByA) 
{ 
    subClientQuery = subClientQuery.OrderBy(c => c.a); 
} 
else 
{ 
    subClientQuery = subClientQuery.OrderBy(c => c.b); 
} 

subClientQuery = subClientQuery.Skip(50).Take(30); 

Edit: Bonus in response to comment.

If you want a Func<Client, string>, you get it this way:

Func<Client, string> func = c => c.e.f.g.b;

If you want an Expression, do it this way (compiler takes care of it).

Expression<Func<Client, string>> expr = c => c.e.f.g.b;

I actually had this line in my code until I realized it would be difficult to use it in the Where call. Maybe you can figure it out.


Expression<Func<Client, SubClient>> selectExpr =
  GetSelectExpressionForFilterCriteria(filterByA, filterByB,
    filterByC, filterByD, filterByE);

Expression<Func<SubClient, bool>> filterExpr =
  GetFilterExpressionForFilterCriteria(filterByA, filterByB,
    filterByC, filterByD, filterByE);

subClientQuery = clientQuery.Select(selectExpr).Where(filterExpr);
David B
Yes, I looked at that approach, but it produced two problems for me 1) The class that would have to be used is Client as I need to use Distinct against it in Step 2 (multiple z's could link to the same client). But Client is quite a wide table, so the SQL produced has a very large DISTINCT clause (when actually all I need is the few fields that I can filter against plus the client's id); 2) Client has an XML column which cannot be part of the Distinct (it throws a SqlException). That is why the inner most query uses a projection rather than return a straight table. Thanks for your interest.
Piers Lawson
Ah, interesting. Editted for those considerations. I think the big problem is that the anonymous type prevents you from declaring the Query variable (I named it subClientQuery) until you are ready to use it.
David B
Piers Lawson
I was looking at using Func and Expression. e.g. Func<Client, string> selector = client => client.e.f.g.b ... b = selectorWhich compiles but throws the "has no supported translation to SQL" exception on execution. Reading around I believe I need something based on Expression<Func<>> but how I assign this to b I can't figure out. It is the sort of thing LINQKit might solve.
Piers Lawson
Q: "How can I write good code when all I have are conditionals?" -A: "Hide them, behind this."
David B
Using the Expression in a where clause is easy enough... you just need the right signature. Expression<Func<Client, bool>> expr = c => c.e.f.g.b.StartsWith("ABC"); Note the change of string to bool in the Func<>. You can now pass expr into Where(expr). But how to pass an Expression into a selector as part of the select... now that is something I cannot figure out!
Piers Lawson
Replacing the select statement is possible and I have tried that approach... but we are basically back where we started, how to programmatically build the select. IE the tough nut still exists, how do you implement GetSelectExpressionForFilterCriteria?
Piers Lawson
A: 

how do you implement GetSelectExpressionForFilterCriteria?

Here's what I came up with for implementing After reading these articles.

http://www.codewrecks.com/blog/index.php/2009/08/12/manipulate-expression-tree-in-dtogenerator/

http://msdn.microsoft.com/en-us/library/bb344570.aspx

This is completely freehand and untested. I wouldn't even post it if it wasn't so easy to verify Expression construction by ToString();

using System.Linq.Expressions;
using System.Reflection;


NewExpression newSubClient = Expression.New(typeof(SubClient));
Console.WriteLine(newSubClient.ToString());

List<MemberBinding> bindings = new List<MemberBinding>();
ParameterExpression paramExpr = Expression.Parameter(typeof(Client), "c");

MemberInfo idMember = typeof(SubClient).GetMember("Id")[0];
MemberBinding idBinding = Expression.Bind(
  idMember,
  Expression.Property(paramExpr, "id")
);
Console.WriteLine(idBinding.ToString());

//save it for later
bindings.Add(idBinding);

if (filterByA)
{
  MemberInfo aMember = typeof(SubClient).GetMember("a")[0];  
  MemberBinding aBinding = Expression.Bind(
    aMember,
    Expression.Property(paramExpr, "a")
  );
  Console.WriteLine(aBinding.ToString());

  //save it for later
  bindings.Add(aBinding);
}
if (filterByB)
{
  MemberInfo bMember = typeof(SubClient).GetMember("b")[0];
  MemberBinding bBinding = Expression.Bind(
    aMember,
    Expression.Property(
      Expression.Property(
        Expression.Property(
          Expression.Property(paramExpr, "e")
        , "f")
      , "g")
    , "b")
  );
  Console.WriteLine(bBinding.ToString());

  //save it for later
  bindings.Add(bBinding);
}

MemberInitExpression newWithInit = Expression.MemberInit
(
  newSubClient,
  bindings  //use any bindings that we created.
);
Console.WriteLine(newWithInit.ToString());

Expression<Func<Client, SubClient>> result =
  Expression.Lambda<Func<Client, SubClient>>
  (
    newWithInit,
    paramExpr
  );

Console.WriteLine(result);
return result;
David B
This is the part where Timores can downvote me fairly for resorting to opaque strings.
David B
A: 

So after rejecting the use of strings and deciding that manually building the expression was going to be way to complex (thanks David B for your efforts there) I decided to give LinqKit a go. This is based on some cool code written by Tomas Petricek. I had a couple of false starts but within 30 minutes it was all working exactly as I hoped. I now have code approximately like this:

// Step 1
var visibleZQuery = from x in xs.AsExpandable()
                         from y in ys
                         from z in yas
                         where
                             ...
                         select Z;

// Step 2
if (filterByA)
{
    visibleZQuery = visibleZQuery.Where(client => client.a > 10);
}

if (filterByB)
{
    visibleZQuery = visibleZQuery.Where(client => client.b.StartsWith("ABC"));
}

Expression<Func<Z, string>> aSelector = z => string.Empty;
Expression<Func<Z, string>> bSelector = z => string.Empty;

if (filterByA)
{
    aSelector = z => z.Client.a;
}

if (filterByB)
{
    bSelector = z => z.Client.e.f.g.b;
}

var filteredClientQuery = from z in visibleZQuery
                          select new 
                          { 
                              id = z.Client.Id, 
                              a = aSelector.Invoke(z), 
                              b = aSelector.Invoke(z)
                          }; 

// Step 3
filteredClientQuery = filteredClientQuery.Distinct();

if (filterByA)
{
    filteredClientQuery = filteredClientQuery.OrderBy(client => client.a);
}
else if (filterByB)
{
    filteredClientQuery = filteredClientQuery.OrderBy(client => client.b);
}

filteredClientQuery = filteredClientQuery.Skip(50).Take(30);

// Step 4
var fullQuery = from filterClientSummary in filteredClientQuery
                 join client in Clients on filterClientSummary.Id equals client.Id
                 ...;

There are a few changes to the code but the most significant is that the first query uses AsExpandable() to introduce the LinqKit wrapper and the select statement in Step 2 is populated with expressions that are defined outside of the construction of the main query. Everything is type safe and, apart from the AsExpandable(), the rest of the code works much as a developer who is used to Linq to SQL would expect.

I confirmed that this approach still appears to create efficient SQL that now only joins to tables if they are actually required.

Thanks to Timores and David B for your input!

Piers Lawson