views:

125

answers:

2

I'm working on an application that allows dentists to capture information about certain clinical activities. While the application is not highly customizable (no custom workflows or forms) it does offer some rudimentary customization capabilities; clients can choose to augment the predefined form fields with their own custom ones. There are about half a dozen different field types that admins can create (i.e. Text, Date, Numeric, DropDown, etc). We're using Entity-Attribute-Value (EAV) on the persistence side to model this functionality.

One of the other key features of the application is the ability to create custom queries against these custom fields. This is accomplished via a UI in which any number of rules (Date <= (Now - 5 Days), Text Like '444', DropDown == 'ICU') can be created. All rules are AND'ed together to produce a query.

The current implementation (which I "inherited") is neither object oriented nor unit testable. Essentially, there is a single "God" class that compiles all the myriad rule types directly into a complex dynamic SQL statement (i.e. inner joins, outer joins, and subselects). This approach is troublesome for several reasons:

  • Unit testing individual rules in isolation is nearly impossible
  • That last point also means adding additional rule types in the future will most definitely violate the Open Closed Principle.
  • Business logic and persistence concerns are being co-mingled.
  • Slow running unit tests since a real database is required (SQLLite can't parse T-SQL and mocking out a parser would be uhh...hard)

I'm trying to come up with a replacement design that is flexible, maintainable and testable, while still keeping query performance fairly snappy. This last point is key since I imagine an OOAD based implementation will move at least some of the data filtering logic from the database server to the (.NET) application server.

I'm considering a combination of the Command and Chain-of-Responsibility patterns:

The Query class contains a collection of abstract Rule classes (DateRule, TextRule, etc). and holds a reference to a DataSet class that contains an unfiltered set of data. DataSet is modeled in a persistence agnostic fashion (i.e no references or hooks into database types)

Rule has a single Filter() method which takes in an DataSet, filters it appropriately, and then returns it to the caller. The Query class than simply iterates over each Rule, allowing each Rule to filter the DataSet as it sees fit. Execution would stop once all rules have been executed or once the DataSet has been filtered down to nothing.

The one thing that worries me about this approach are the performance implications of parsing a potentially large unfiltered data set in .NET. Surely there are some tried and true approaches to solving just this kind of problem that offer a good balance between maintainability and performance?

One final note: management won't allow the use of NHibernate. Linq to SQL might be possible, but I'm not sure how applicable that technology would be to the task at hand.

Many thanks and I look forward to everyone's feedback!

Update: Still looking for a solution on this.

+1  A: 

I think that LINQ to SQL would be an ideal solution coupled, perhaps, with Dynamic LINQ from the VS2008 samples. Using LINQ, particularly with extension methods on IEnumerable/IQueryable, you can build up your queries using your standard and custom logic depending on the inputs that you get. I use this technique heavily to implement filters on many of my MVC actions to great effect. Since it actually builds an expression tree then uses it to generate the SQL at the point where the query needs to be materialized, I think it would be ideal for your scenario since most of the heavy lifting is still done by the SQL server. In cases where LINQ proves to generate non-optimal queries you can always use table-valued functions or stored procedures added to your LINQ data context as methods to take advantage of optimized queries.

Updated: You might also try using PredicateBuilder from C# 3.0 in a Nutshell.

Example: find all Books where the Title contains one of a set of search terms and the publisher is O'Reilly.

 var predicate = PredicateBuilder.True<Book>();
 predicate = predicate.And( b => b.Publisher == "O'Reilly" );
 var titlePredicate = PredicateBuilder.False<Book>();
 foreach (var term in searchTerms)
 {
     titlePredicate = titlePredicate.Or( b => b.Title.Contains( term ) );
 }
 predicate = predicate.And( titlePredicate );

 var books = dc.Book.Where( predicate );
tvanfosson
hee hee - might as well ask him "have you tried javascript?"
Steven A. Lowe
Tim,Thanks for the reply. I don't suppose you have any sample code I could study (other than the Dynamic LINQ samples you referred to) ?
Dirk
A: 

The way I've seen it done is by creating objects that model each of the conditions you want the user to build their query from, and build up a tree of objects using those.

From the tree of objects you should be able to recursively build up an SQL statement that satisfies the query.

The basic ones you'll need will be AND and OR objects, as well as objects to model comparison, like EQUALS, LESSTHAN etc. You'll probably want to use an interface for these objects to make chaining them together in different ways easier.

A trivial example:

public interface IQueryItem
{
    public String GenerateSQL();
}


public class AndQueryItem : IQueryItem
{
    private IQueryItem _FirstItem;
    private IQueryItem _SecondItem;

    // Properties and the like

    public String GenerateSQL()
    {
        StringBuilder builder = new StringBuilder();
        builder.Append(_FirstItem.GenerateSQL());
        builder.Append(" AND ");
        builder.Append(_SecondItem.GenerateSQL());

        return builder.ToString();
    }
}

Implementing it this way should allow you to Unit Test the rules pretty easily.

On the negative side, this solution still leaves the database to do a lot of the work, which it sounds like you don't really want to do.

Benny Hallett
This is definitely a workable solution, but I was hoping to engineer a solution that avoids the trappings associated with spooling out dynamic SQL (unit-testability, testing rules in isolation, etc).
Dirk