Scaling a rich domain model

+8 Q:

Domain Driven Design encourages you to use a rich domain model. This means all the domain logic is located in the domain model, and that the domain model is supreme. Persistence becomes an external concern, as the domain model itself ideally knows nothing of persistence (e.g. the database).

I've been using this in practice on a medium-size one-man project (>100k lines of Java) and I'm discovering many advantages, mostly the flexibility and refactorability that this offers over a database-oriented approach. I can add and remove domain classes, hit a few buttons and an entire new database schema and SQL layer rolls out.

However, I often face issues where I'm finding it difficult to reconcile the rich domain logic with the fact that there's an SQL database backing the application. In general, this results in the typical "1+N queries problem", where you fetch N objects, and then execute a nontrivial method on each object that again triggers queries. Optimizing this by hand allows you to do the process in a constant number of SQL queries.
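To make the "1+N" pattern concrete, here is a minimal sketch. It assumes a hypothetical FakeDb class that stands in for the SQL backend and simply counts round trips; the users/orders schema, the method names and the data are all invented for illustration and are not taken from my project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NPlusOneDemo {
    // Naive domain-style code: one query for the users, then one more per user.
    static int naiveGrandTotal(FakeDb db) {
        int total = 0;
        for (int userId : db.findUserIds()) {
            for (int orderTotal : db.findOrderTotals(userId)) {
                total += orderTotal;
            }
        }
        return total;
    }

    // Hand-optimized version: a single aggregate query, whatever N is.
    static int optimizedGrandTotal(FakeDb db) {
        int total = 0;
        for (int sum : db.sumOrderTotalsPerUser().values()) {
            total += sum;
        }
        return total;
    }

    public static void main(String[] args) {
        FakeDb naiveDb = new FakeDb();
        int naive = naiveGrandTotal(naiveDb);
        FakeDb optimizedDb = new FakeDb();
        int optimized = optimizedGrandTotal(optimizedDb);
        System.out.println("naive: total=" + naive + " queries=" + naiveDb.queryCount);
        System.out.println("optimized: total=" + optimized + " queries=" + optimizedDb.queryCount);
    }
}

// Hypothetical stand-in for a SQL database; every method call counts as one query.
class FakeDb {
    int queryCount = 0;
    private final Map<Integer, List<Integer>> orderTotalsByUser = new HashMap<>();

    FakeDb() {
        orderTotalsByUser.put(1, List.of(10, 20));
        orderTotalsByUser.put(2, List.of(5));
        orderTotalsByUser.put(3, List.of(7));
    }

    // Think: SELECT id FROM users
    List<Integer> findUserIds() {
        queryCount++;
        return new ArrayList<>(orderTotalsByUser.keySet());
    }

    // Think: SELECT total FROM orders WHERE user_id = ?
    List<Integer> findOrderTotals(int userId) {
        queryCount++;
        return orderTotalsByUser.get(userId);
    }

    // Think: SELECT user_id, SUM(total) FROM orders GROUP BY user_id
    Map<Integer, Integer> sumOrderTotalsPerUser() {
        queryCount++;
        Map<Integer, Integer> sums = new HashMap<>();
        orderTotalsByUser.forEach((userId, totals) ->
                sums.put(userId, totals.stream().mapToInt(Integer::intValue).sum()));
        return sums;
    }
}
```

With three users the naive version issues four queries and the optimized one issues a single query, while both compute the same total.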

In my design I allow for a system to plug these optimized versions in. I do this by moving the code into a "query module" which contains dozens of domain-specific queries (e.g. getActiveUsers), of which I have both in-memory (naive and not scalable) and SQL-based (for deployment use) implementations. This allows me to optimize the hotspots, but there are two main disadvantages:

1. I'm effectively moving some of my domain logic to places where it doesn't really belong, and in fact even pushing it into SQL statements.
2. The process requires me to peruse query logs to find out where the hotspots are, after which I have to refactor the code, reducing its level of abstraction by lowering it into queries.
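For readers who want a picture of the query module, the following is a rough sketch rather than my actual code. Only the name getActiveUsers comes from the description above; the User class, the users table with its name and active columns, and the two implementation class names are assumptions made up for the example. The SQL variant uses plain JDBC.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class QueryModuleDemo {
    public static void main(String[] args) {
        QueryModule queries = new InMemoryQueryModule(List.of(
                new User("ann", true), new User("bob", false), new User("cy", true)));
        for (User user : queries.getActiveUsers()) {
            System.out.println(user.name);
        }
    }
}

// Hypothetical domain class, reduced to what the example needs.
class User {
    final String name;
    final boolean active;

    User(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

// One abstract method per domain-specific query.
abstract class QueryModule {
    abstract List<User> getActiveUsers();
}

// Naive version: filters objects that are already in memory. Not scalable.
class InMemoryQueryModule extends QueryModule {
    private final List<User> users;

    InMemoryQueryModule(List<User> users) {
        this.users = users;
    }

    @Override
    List<User> getActiveUsers() {
        List<User> result = new ArrayList<>();
        for (User user : users) {
            if (user.active) {
                result.add(user);
            }
        }
        return result;
    }
}

// Deployment version: the same query pushed down into SQL (assumed schema).
class SqlQueryModule extends QueryModule {
    private final Connection connection;

    SqlQueryModule(Connection connection) {
        this.connection = connection;
    }

    @Override
    List<User> getActiveUsers() {
        String sql = "SELECT name FROM users WHERE active = TRUE";
        try (PreparedStatement statement = connection.prepareStatement(sql);
             ResultSet rows = statement.executeQuery()) {
            List<User> result = new ArrayList<>();
            while (rows.next()) {
                result.add(new User(rows.getString("name"), true));
            }
            return result;
        } catch (SQLException e) {
            throw new IllegalStateException("getActiveUsers query failed", e);
        }
    }
}
```

The sketch also shows the first disadvantage: the rule "active users only" now exists twice, once as Java and once as a WHERE clause.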

Is there a better, cleaner way to reconcile Domain-Driven-Design and its Rich Domain Model with the fact that you can't have all your entities in memory and are therefore confined to a database backend?

No, not really. Not that I'm aware of anyway (though I'm interested to hear any of DDD's proponents' responses to the contrary).

In my own experience, and that of the very experienced team I work with, if you want optimal performance from a database-backed application, transforming its architecture to be service-oriented is inevitable. I wrote more about this here (the article talks about lazy-loaded properties, but you could consider the point to apply to any method on the class that needs to retrieve more data to do its job).

So, much as you're doing now, you could start with a rich domain model and transform it to be service-oriented where necessary for performance reasons. As long as you have defined performance goals and you're meeting them, there's no need to transform everything. I think it's a pretty decent pragmatic approach.

Greg Beech 2008-12-18 16:54:33

I guess a really fundamental improvement would require a different language paradigm, where you describe parts of your logic declaratively so that they can automatically be compiled into queries... but this kind of story has its own share of problems :-)

Wouter Lievens 2008-12-19 09:15:23

+3 A:

There are at least two ways to look at this problem. One is the technical "what can I do to load my data smarter" version. The only really smart thing I know of is dynamic collections that are partially loaded, with the rest loaded on demand and possibly with parts preloaded. There was an interesting talk at JavaZone 2008 about this.

The second approach has been more of my focus in the time I've been working with DDD: how can I make my model more "loadable" without sacrificing too much of the DDD goodness? My hypothesis over the years has always been that a lot of DDD models model domain concepts that are actually the sum of all allowable domain states, across all business processes and the different states that occur in each business process over time. It is my belief that a lot of these loading problems are greatly reduced if the domain models are normalized slightly more in line with the processes/states. This usually means there is no "Order" object, because an order typically exists in multiple distinct states that have fairly different semantics attached (ShoppingCartOrder, ShippedOrder, InvoicedOrder, HistoricalOrder). If you try to encapsulate this in a single Order object, you invariably end up with a lot of loading/construction problems.
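As a rough illustration of what I mean, here is a minimal sketch with two of the state-specific classes. The class names come from the list above; the fields, the ship method and the direct cart-to-shipped transition are assumptions invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class OrderStatesDemo {
    public static void main(String[] args) {
        ShoppingCartOrder cart = new ShoppingCartOrder();
        cart.addItem("sku-1");
        cart.addItem("sku-2");
        ShippedOrder shipped = cart.ship("TRACK-1");
        System.out.println(shipped.itemCount + " items, tracking " + shipped.trackingNumber);
    }
}

// An order that is still being edited: it needs its full item list loaded.
final class ShoppingCartOrder {
    private final List<String> items = new ArrayList<>();

    void addItem(String sku) {
        items.add(sku);
    }

    int itemCount() {
        return items.size();
    }

    // Hypothetical transition: produces a different class instead of flipping a status flag.
    ShippedOrder ship(String trackingNumber) {
        if (items.isEmpty()) {
            throw new IllegalStateException("cannot ship an empty cart");
        }
        return new ShippedOrder(items.size(), trackingNumber);
    }
}

// A shipped order is immutable and carries only what shipping-related logic needs,
// so loading one never has to pull in the item collection.
final class ShippedOrder {
    final int itemCount;
    final String trackingNumber;

    ShippedOrder(int itemCount, String trackingNumber) {
        this.itemCount = itemCount;
        this.trackingNumber = trackingNumber;
    }
}
```

Each class can then be mapped and loaded on its own terms, rather than one Order class having to cope with every state.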

But there's no silver bullet here.

krosenvold 2008-12-18 17:15:32

The problems you describe pertain mostly to inheritance, proxying and lazy state issues (in my context at least) and I've pretty much "solved" them for my project. Thanks for the suggestion, but I don't really see how changes to how my model is expressed can make a difference in my problem.

Wouter Lievens 2008-12-19 09:17:54

Except of course if I make less data computed, and more data explicitly stored in fields, but that's really a form of denormalization.

Wouter Lievens 2008-12-19 09:18:43

I think that's valid; I haven't found the modelling techniques to be universally valid. Quite often you can choose among multiple approaches to the domain model, but not always.

krosenvold 2008-12-19 16:29:11

+1 A:

In my experience this is the only way to do things. If you write a system that attempts to completely hide or abstract the persistence layer then there is no way that you can optimise things using the specifics of the persistence layer.

I have been running up against this issue recently and have been working on a solution where persistence layers can choose to implement interfaces that represent optimisations. I have just been playing with it, but to use your getActiveUsers example it goes like this...

First write a ListAllUsers method that does everything at the domain level. For a while this will work, then it will begin to get too slow.

When using the rich domain model gets slow, create an interface called "IListActiveUsers" (or probably something better) and have your persistence code implement this interface using whatever techniques are appropriate (probably optimised SQL).

Now you can write a layer that checks these interfaces and calls the specific method if it exists.

This isn't perfect and I don't have a lot of experience with this sort of thing. But it seems to me that the key is to ensure that if you are using a totally naive persistence method then all code should still work. Any optimization needs to be done as an addition to this.
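The steps above could be sketched roughly as follows. This is an untested idea written out in Java, not production code: only the interface name IListActiveUsers comes from the text, while UserStore, the two store classes, the User class and the UserQueries helper are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class OptionalOptimisationDemo {
    public static void main(String[] args) {
        List<User> data = List.of(new User("ann", true), new User("bob", false));
        // Both calls give the same answer; only the second uses the optimised path.
        System.out.println(UserQueries.listActiveUsers(new NaiveStore(data)).size());
        System.out.println(UserQueries.listActiveUsers(new OptimisedStore(data)).size());
    }
}

class User {
    final String name;
    final boolean active;

    User(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

// The minimum every persistence layer must offer.
interface UserStore {
    List<User> loadAllUsers();
}

// Optional optimisation that a persistence layer may choose to implement.
interface IListActiveUsers {
    List<User> listActiveUsers();
}

// The layer that checks for the optional interface and falls back to domain-level code.
class UserQueries {
    static List<User> listActiveUsers(UserStore store) {
        if (store instanceof IListActiveUsers) {
            return ((IListActiveUsers) store).listActiveUsers();
        }
        List<User> result = new ArrayList<>();
        for (User user : store.loadAllUsers()) {
            if (user.active) {
                result.add(user);
            }
        }
        return result;
    }
}

// Totally naive persistence: all code still works against it.
class NaiveStore implements UserStore {
    private final List<User> users;

    NaiveStore(List<User> users) {
        this.users = users;
    }

    @Override
    public List<User> loadAllUsers() {
        return users;
    }
}

// A store that also offers the fast path (a real one would run optimised SQL here).
class OptimisedStore extends NaiveStore implements IListActiveUsers {
    boolean fastPathUsed = false;

    OptimisedStore(List<User> users) {
        super(users);
    }

    @Override
    public List<User> listActiveUsers() {
        fastPathUsed = true;
        List<User> result = new ArrayList<>();
        for (User user : loadAllUsers()) {
            if (user.active) {
                result.add(user);
            }
        }
        return result;
    }
}
```

Callers only ever go through UserQueries, so swapping a naive store for an optimised one changes performance but not behaviour.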

Jack Ryan 2008-12-18 17:39:41

What you describe is pretty much exactly how I do it. Except my queries are methods in an abstract QueryModule class, rather than separate interface/implementation pairs, but that's really not a major difference. At least I feel less lonely now :-)

Wouter Lievens 2008-12-19 09:14:13

Udi Dahan talks about this technique in the following presentation: http://www.infoq.com/presentations/Making-Roles-Explicit-Udi-Dahan

Jonas Kongslund 2010-08-22 09:06:17
