I'm working on an application that allows dentists to capture information about certain clinical activities. While the application is not highly customizable (no custom workflows or forms), it does offer some rudimentary customization capabilities: clients can augment the predefined form fields with their own custom ones. There are about half a dozen different field types that admins can create (e.g. Text, Date, Numeric, DropDown). We're using Entity-Attribute-Value (EAV) on the persistence side to model this functionality.
One of the other key features of the application is the ability to create custom queries against these custom fields. This is accomplished via a UI in which any number of rules (Date <= (Now - 5 Days), Text Like '444', DropDown == 'ICU') can be created. All rules are AND'ed together to produce a query.
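To make the rule semantics concrete, here is a minimal sketch of rules as data with AND semantics. It is written in Python only for brevity (the real application is .NET), and every name in it (`RuleSpec`, `like`, `matches`, the field names) is hypothetical rather than taken from the existing code base. A fixed date stands in for "Now" so the example is reproducible.

```python
from dataclasses import dataclass
from datetime import date, timedelta
import operator
from typing import Any, Callable


@dataclass(frozen=True)
class RuleSpec:
    """Hypothetical rule shape: a field name, a comparison, and a value."""
    field: str
    compare: Callable[[Any, Any], bool]
    value: Any


def like(actual: str, fragment: str) -> bool:
    # Rough stand-in for SQL LIKE '%fragment%' (an assumption about the intended match).
    return fragment in actual


def matches(record: dict, rules: list) -> bool:
    # AND semantics: a record qualifies only if every rule holds.
    # A record that lacks the field fails that rule.
    return all(
        r.field in record and r.compare(record[r.field], r.value)
        for r in rules
    )


today = date(2024, 1, 10)  # fixed stand-in for "Now"
rules = [
    RuleSpec("Date", operator.le, today - timedelta(days=5)),
    RuleSpec("Text", like, "444"),
    RuleSpec("DropDown", operator.eq, "ICU"),
]

hit = {"Date": date(2024, 1, 3), "Text": "A-444-B", "DropDown": "ICU"}
miss = {"Date": date(2024, 1, 9), "Text": "A-444-B", "DropDown": "ICU"}
```

Here `matches(hit, rules)` holds because all three rules pass, while `miss` fails on the date rule alone.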
The current implementation (which I "inherited") is neither object oriented nor unit testable. Essentially, there is a single "God" class that compiles all the myriad rule types directly into a complex dynamic SQL statement (i.e. inner joins, outer joins, and subselects). This approach is troublesome for several reasons:
- Unit testing individual rules in isolation is nearly impossible
- Because every rule type is compiled inside that one class, adding a new rule type means modifying it, which violates the Open/Closed Principle.
- Business logic and persistence concerns are being co-mingled.
- Unit tests run slowly because a real database is required (SQLite can't parse T-SQL, and mocking out a parser would be uhh...hard).
I'm trying to come up with a replacement design that is flexible, maintainable and testable, while still keeping query performance fairly snappy. That last point is key, since I imagine an OOAD-based implementation will move at least some of the data filtering logic from the database server to the (.NET) application server.
I'm considering a combination of the Command and Chain-of-Responsibility patterns:
The Query class contains a collection of Rule objects, each a concrete subclass of an abstract Rule class (DateRule, TextRule, etc.), and holds a reference to a DataSet class that contains an unfiltered set of data. DataSet is modeled in a persistence-agnostic fashion (i.e. no references or hooks into database types).
Rule has a single Filter() method which takes in a DataSet, filters it appropriately, and then returns it to the caller. The Query class then simply iterates over each Rule, allowing each Rule to filter the DataSet as it sees fit. Execution would stop once all rules have been executed or once the DataSet has been filtered down to nothing.
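A rough sketch of the design described above, again in Python purely for illustration (the real thing would be C#). The class names mirror the ones in the description, but the constructor arguments, the field names, and the use of a plain list of dicts as the DataSet are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from datetime import date


class Rule(ABC):
    """Abstract rule: takes a data set and returns the filtered data set."""

    @abstractmethod
    def filter(self, data: list) -> list:
        ...


class DateRule(Rule):
    # Hypothetical variant: keeps rows whose date field is on or before a cutoff.
    def __init__(self, field: str, cutoff: date):
        self.field = field
        self.cutoff = cutoff

    def filter(self, data: list) -> list:
        return [
            row for row in data
            if row.get(self.field) is not None and row[self.field] <= self.cutoff
        ]


class TextRule(Rule):
    # Hypothetical variant: keeps rows whose text field contains a fragment.
    def __init__(self, field: str, fragment: str):
        self.field = field
        self.fragment = fragment

    def filter(self, data: list) -> list:
        return [row for row in data if self.fragment in row.get(self.field, "")]


class Query:
    """Holds the rules and the unfiltered data; applies each rule in turn."""

    def __init__(self, rules: list, data: list):
        self.rules = rules
        self.data = data

    def execute(self) -> list:
        result = self.data
        for rule in self.rules:
            if not result:  # stop early once nothing is left to filter
                break
            result = rule.filter(result)
        return result


rows = [
    {"id": 1, "VisitDate": date(2024, 1, 3), "Notes": "code 444"},
    {"id": 2, "VisitDate": date(2024, 1, 9), "Notes": "code 444"},
    {"id": 3, "VisitDate": date(2024, 1, 2), "Notes": "none"},
]
query = Query(
    [DateRule("VisitDate", date(2024, 1, 5)), TextRule("Notes", "444")],
    rows,
)
result = query.execute()
```

Each rule can be unit tested against a small in-memory list with no database involved, which is the main attraction. The cost is visible too: `rows` has to be fully loaded before the first rule runs.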
The one thing that worries me about this approach is the performance cost of loading and filtering a potentially large unfiltered data set in .NET. Surely there are some tried and true approaches to solving just this kind of problem that offer a good balance between maintainability and performance?
One final note: management won't allow the use of NHibernate. LINQ to SQL might be possible, but I'm not sure how applicable that technology would be to the task at hand.
Many thanks and I look forward to everyone's feedback!
Update: Still looking for a solution on this.