CRUD-style data driven distributed .NET application architecture questions

+2 Q:

Context: Building a smart client application on the .NET platform where you have a complex database model with a high number of columns involved. The natural application style is a typical data driven CRUD. There’s also a fair bit of server side logic in some cases, and somewhat complex validations. You have full control of client and server, so the need for interoperability is at a minimum.

This question has a lot of details, apologies for that, but it is because I want to set the proper context for the answers.

A few more assumptions
- As is not uncommon in the Microsoft world, most previous applications have been written with DataSets, so it is the best known technology for the developers involved. But let’s say the developers are well versed with OO thinking as well.
- You will need to run validations on both client and server.
- You don’t show most data in tabular form.
- This is not an intranet application, so you can’t assume too much about the bandwidth

The biggest question: Datasets or objects?

If you go for datasets you have a few positives and negatives
- In terms of positives: You get a bit of Microsoft support in terms of getting the data out of the database, getting the data over the network and returning changed data over the network in smaller chunks – since you can specify only to send changes. Sending less data is good since there’s potentially quite a bit of data involved.
- The negatives are: In terms of validation, business logic and so on, you get a procedural style of code and lose the benefits of object-oriented code: behavior and data together, a more natural way of working and thinking about what you’re doing, and possibly closer ties to the validation logic. You can also disregard the benefit of binding a dataset to a grid, since that is not the common use case here.

If you go for objects, it’s the same drill, but there are more options involved:
Positives: Behavior and data together. Validation logic closer. Easier to see and understand the relationships between objects. More readable code. Easier to unit test. But there are quite a few choices to make, and work to do, as well:

OR/Mapping
- Getting the data from the relational model to objects. OR-mappers aren’t that complex, and will be able to handle it well. But it adds to the development time.

Contract mapping
- It’s generally good practice to map data from server-side objects to contract objects, likely DTOs. Since this is an application with a good fit for CRUD-style architecture, DTOs don’t really add much value to the picture, just mapping work.

Shared code
- You can go for a shared code scenario, where the assembly with the domain data and logic is available on both client and server side. That’s tight coupling, but it’s not necessarily bad when you have a naturally tightly coupled client-server app.

Whether or not you add a contract layer, you have large object structures that must be sent over the wire. Since we’re controlling both client and server, the transport and encoding should be binary encoding over TCP. That will help. With datasets, you have the option of only sending the changes back. Sending the entire object structure back and forth is a likely performance issue. An alternative to sending the entire object structure is to somehow identify the changes involved (Create, Update, Delete) and send only information about those. In theory it’s not too hard to send the aggregate root ID to the server along with the changes, ask the server to lazy load the aggregate root, apply the changes, and then save again. But the big complexity is identifying the changes made. Do you ever go for this approach? Why? How exactly do you do it?
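To make the "send only the changes" idea concrete, here is a minimal sketch of a client-side change set. It is not taken from any framework: `ChangeKind`, `Change` and `ChangeSet` are hypothetical names, and it assumes entities are identified by a `Guid` and that property values can be shipped as name/value pairs.

```csharp
using System;
using System.Collections.Generic;

public enum ChangeKind { Create, Update, Delete }

// One recorded change to a single entity inside the aggregate (hypothetical shape).
public sealed class Change
{
    public ChangeKind Kind { get; set; }
    public string EntityType { get; set; }
    public Guid EntityId { get; set; }
    // Property name -> new value; stays empty for deletes.
    public Dictionary<string, object> Values { get; } = new Dictionary<string, object>();
}

// What travels over the wire instead of the full object graph.
public sealed class ChangeSet
{
    public Guid AggregateRootId { get; }
    public List<Change> Changes { get; } = new List<Change>();

    public ChangeSet(Guid aggregateRootId) { AggregateRootId = aggregateRootId; }

    public void RecordCreate(string entityType, Guid entityId)
    {
        Changes.Add(new Change { Kind = ChangeKind.Create, EntityType = entityType, EntityId = entityId });
    }

    public void RecordUpdate(string entityType, Guid entityId, string property, object newValue)
    {
        // Fold the edit into a pending Create or Update for the same entity, if any.
        Change pending = Changes.Find(c =>
            c.Kind != ChangeKind.Delete && c.EntityType == entityType && c.EntityId == entityId);
        if (pending == null)
        {
            pending = new Change { Kind = ChangeKind.Update, EntityType = entityType, EntityId = entityId };
            Changes.Add(pending);
        }
        pending.Values[property] = newValue;
    }

    public void RecordDelete(string entityType, Guid entityId)
    {
        bool createdLocally = Changes.Exists(c =>
            c.Kind == ChangeKind.Create && c.EntityType == entityType && c.EntityId == entityId);
        // Earlier pending changes to this entity are superseded by the delete.
        Changes.RemoveAll(c => c.EntityType == entityType && c.EntityId == entityId);
        // An entity the server never saw needs no Delete entry.
        if (!createdLocally)
        {
            Changes.Add(new Change { Kind = ChangeKind.Delete, EntityType = entityType, EntityId = entityId });
        }
    }
}
```

On the server, a handler would load the aggregate by `AggregateRootId`, apply each `Change` in order, and save. The hard part remains hooking `RecordUpdate` and friends into every client-side edit.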

Presentation
The exact UI technology is not really that important for the question; WinForms, Silverlight or WPF are all possible. Let’s assume that we are using WPF since it’s a new smart client. That means we have two-way binding and can use MVVM properly.

The objects bound to in the user interface will need to implement INotifyPropertyChanged and raise an event each time a property is updated. How do you solve this? If you go for the shared code scenario, you can add it to the domain objects, but that will involve adding code and logic on the server side that is never meant to be used there. The separation is more natural if you go for contract objects, but that’s not a lot of value added just to justify a layer of mapping.
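For reference, the INotifyPropertyChanged boilerplate in question can be reduced to a small base class. This is only a sketch: `NotifyingObject` and `CustomerViewModel` are hypothetical names, and `[CallerMemberName]` assumes .NET 4.5 or later, which is newer than the tooling available when this question was asked.

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

// Hypothetical base class that centralizes the change notification plumbing.
public abstract class NotifyingObject : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // Sets the backing field and raises PropertyChanged only when the value actually changes.
    protected bool SetField<T>(ref T field, T value, [CallerMemberName] string propertyName = null)
    {
        if (EqualityComparer<T>.Default.Equals(field, value)) return false;
        field = value;
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
        return true;
    }
}

// Example bound object; each property setter shrinks to a single line.
public class CustomerViewModel : NotifyingObject
{
    private string _name;

    public string Name
    {
        get { return _name; }
        set { SetField(ref _name, value); }
    }
}
```

Where this class lives (shared domain assembly, contract objects, or a client-only view model) is exactly the design question raised above; the helper itself does not settle it.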

Technologies
There are a few technologies available that can help with solving some of the issues, but that often complicate others. Do you use them, or build things from the ground up yourself?
- CSLA is possible, but it makes unit testing harder, and seems to add a tighter coupling to data access. It does help with a number of the problems, but personally I have no experience with this technology, so whether it is a great fit is a bit hard to say.
- WCF RIA Services would be possible for a Silverlight solution, but there are definitely limitations involved. Size of data is one.
- WCF Data Services is another approach to getting something up quickly, but REST isn’t much help, and you also lack the validation support you have in RIA Services.

Summary
If you’ve gotten this far, I hope you have an idea of where I’m going with this. I’ve tried to scope it down to avoid talking about everything at once, but distributed development is complex, so you have to consider many parts.

Update

Thanks for the responses guys! I was trying to ask the question openly enough to allow for varying answers, but specifically enough to deal with a few not uncommon requirements.

There are different considerations that have different pros and cons, and which vary from system to system. Each usually adds to the complexity of finding a solution. One of the points of this question was to get answers particularly for a few extra requirements that don't necessarily fit the one answer that is often the correct one today: a task-based UI. I'm not a "CRUD-guy", if you will. But a few systems, for various reasons (most often legacy), are a good fit for CRUD.

Many business apps have similar demands that pull in different directions:

Business related
- View: Showing data to the user and updating the same data (Reads and CUDs - Create, Update, Delete)
- Validation: Business rules

UI related
- Validation: UI rules
- UI updates: Code specific to just getting the UI to update on object changes (INotifyPropertyChanged)

Network related
- Data size: Amount of data you send over the wire

DB related
- Lazy loading

SRP/reuse related
- Mapping: Caused by multiple layers of objects / separating concerns

Maintenance/change related
- Changes: Adding new information (columns/fields)
- Amount of code
- Reuse and "reasons to change"

Technical limitations
- Change tracking

But these are just some very specific ones. You always need to know which "-ilities" you find most important, and thus what degree of scalability, availability, extensibility, interoperability, usability, maintainability and testability you need.

If I would try to generalize something for most situations, I'd say something like:

Client
- Use MVVM for separation and testability
- Create the VM on top of DTOs
- Implement INotifyPropertyChanged in the VM.
- Using XamlPowerToys, PostSharp or some other means of helping with this can be worthwhile
- Separate Reads and CUDs in the UI
- Make CUDs task based, and use commands or similar to send those operations to the server side

Server
- Tailor-make a DTO per screen
- OR use the multi-query approach described by Ayende in http://msdn.microsoft.com/en-us/magazine/ff796225.aspx
- Use automapping to avoid the tedious, manual mapping step, which is completely unrelated to the problem you are trying to solve
- Let the domain model be concerned with business operations primarily, including the CUD-related operations, and not reads
- Avoid reusability that adds to the number of reasons to change
- Avoid encapsulation issues
- (And by that enable CQRS style architecture and possibly separate scaling of reads and CUDs in time)
- Try to find a validation approach that fits well with what should be done (Good read: http://www.lostechies.com/blogs/jimmy_bogard/archive/2009/02/15/validation-in-a-ddd-world.aspx)

Is this necessarily the approach I would take in this particular situation?

Well, that's what I wanted to start a discussion on :) But it seems that was harder than I hoped (besides the two of you).

+1 A:

Interesting problem :)

If you start out with a few principles:

- Try to reduce the amount of data sent over the wire
- Try to minimise the amount of time spent writing plumbing code
- Try to improve testability

Based on that I would:

- Use POCO objects to transfer data. DataSets include a lot of information that you may not need
- Use Entity Framework POCO for database access; it saves you mapping from contract objects to data objects
- Place validation in helper classes; this is easy to test and supports the shared code model

In our projects we have saved time using Entity Framework compared with Enterprise Library and DataSets.

For the server-side and client-side objects you could try:

- The client-side object inherits from the server-side object and implements INotifyPropertyChanged
- Place the client-side and server-side objects in separate DLLs; that way there is no unused code on the server
- Use AutoMapper to map between the two types (there may be a better way using interfaces)

Shiraz Bhaiji 2010-09-29 16:28:09

I certainly don't disagree with the principles :) But especially the part about having quite a bit of data is supposed to be a part of this problem. Yes, datasets include sending more data than an object model, and personally I don't like going the dataset way, but it does have a handy feature of only sending the changes after a change (= a lot less data). Not that that is the killer feature, but something similar would be nice for an object model where going the commands way is not really an option

Rune Sundling 2010-09-29 16:52:40

You could try the self-tracking entities in Entity Framework: http://blogs.msdn.com/b/efdesign/archive/2009/03/24/self-tracking-entities-in-the-entity-framework.aspx. They are a middle ground between DTOs and DataSets.

Shiraz Bhaiji 2010-09-29 17:10:09

+2 A:

I can only answer from our own experience. We have tried different frameworks (WCF RIA, IdeaBlade) and have concluded that frameworks will only make things worse. I will explain further down.

First of all you should forget about CRUD. Only demo apps have CRUD; real-world apps have behaviour.

I do not recommend mimicking the whole entity graph on the client side. They are two separate concerns.

You should create tailor-made DTOs for each context. E.g. let's say you have an OrderSearchView; then you create an OrderSearchDto and map only the fields that you need. In the EditOrderView, you would instead use an EditOrderDto, which contains only the fields you need there.

I wouldn't really recommend using an automapping tool between the entities and the DTOs, because often there isn't a one-to-one relationship between the DTO and the entity. The DTO would often be built from multiple backend entities. Mapping is so easy anyway that I don't see the point of a mapping framework. And the real job isn't the mapping, it's writing the unit test, which you would have to do anyway (with or without a mapping framework).
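As an illustration of how small such hand-written mapping can be, here is a sketch of an OrderSearchDto built from two backend entities. The entity classes and all property names are assumptions made up for the example; only the OrderSearchDto name comes from the text above.

```csharp
using System;

// Hypothetical backend entities; the client never sees these.
public class OrderEntity
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public DateTime EarliestEtd { get; set; }
}

public class CustomerEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Tailor-made DTO for the search screen: only the fields that view shows.
public class OrderSearchDto
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public DateTime EarliestEtd { get; set; }
}

public static class OrderSearchDtoMapper
{
    // One DTO is assembled from two entities, so there is no one-to-one mapping to automate.
    public static OrderSearchDto Map(OrderEntity order, CustomerEntity customer)
    {
        if (order == null) throw new ArgumentNullException(nameof(order));
        if (customer == null) throw new ArgumentNullException(nameof(customer));

        return new OrderSearchDto
        {
            OrderId = order.Id,
            CustomerName = customer.Name,
            EarliestEtd = order.EarliestEtd
        };
    }
}
```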

DTOs should be agnostic about client-side technology, and implementing INotifyPropertyChanged on the DTO breaks the Single Responsibility Principle. There is a reason they are called Data Transfer Objects. Instead you create Presenters on the client side. You create an EditOrderPresenter that is a wrapper around the EditOrderDto, so the DTO will just be a private member field inside the EditOrderPresenter. The Presenter is tailor-made for editing in the client layer, so it would usually implement INotifyPropertyChanged. The EditOrderPresenter would usually have the same property names as the DTO.
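A rough sketch of such a wrapper could look like the following. The EditOrderDto and EditOrderPresenter names come from the description above; the individual properties are hypothetical and chosen only for illustration.

```csharp
using System;
using System.ComponentModel;

// Plain transfer object with no UI concerns.
public class EditOrderDto
{
    public int OrderId { get; set; }
    public string Consignee { get; set; }
}

// Client-side wrapper: same property names as the DTO, plus change notification.
public class EditOrderPresenter : INotifyPropertyChanged
{
    private readonly EditOrderDto _dto;

    public EditOrderPresenter(EditOrderDto dto)
    {
        _dto = dto ?? throw new ArgumentNullException(nameof(dto));
    }

    public event PropertyChangedEventHandler PropertyChanged;

    public int OrderId
    {
        get { return _dto.OrderId; }
    }

    public string Consignee
    {
        get { return _dto.Consignee; }
        set
        {
            if (_dto.Consignee == value) return;
            _dto.Consignee = value;   // edit the wrapped DTO so the GUI shows the new value
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Consignee)));
        }
    }
}
```

The view binds to the presenter only, so the DTO assembly needs no reference to anything UI-related.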

You should physically separate the client validation from the entity validation on the server side. Beware of sharing! I think client validation is just GUI tweaking, there to make the GUI experience better. Don't make a big point of having shared validation code between DTO and entity; it might cause more headache than usefulness. Just make sure that you always validate on the server side regardless of what kind of validation is done on the client side. There are two kinds of validation: simple property validation and whole-entity validation (the same goes for DTOs). The entity validation should only be performed at a state transition. Check out Jimmy Nilsson's book on Domain-Driven Design for background knowledge. I would not recommend using a validation rule engine; just use the state pattern.
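One possible reading of "validate the whole entity only at a state transition, using the state pattern" is sketched below. This is an interpretation, not the answerer's actual code: `Order`, `IOrderState`, `ActiveState`, `EditingState` and the ETD rule are all assumptions for the example.

```csharp
using System;
using System.Collections.Generic;

public interface IOrderState
{
    void ChangeEarliestEtd(Order order, DateTime value);
}

// Outside edit mode, changes are rejected outright.
public class ActiveState : IOrderState
{
    public void ChangeEarliestEtd(Order order, DateTime value)
    {
        throw new InvalidOperationException("Order must be in edit mode before it can be changed.");
    }
}

// In edit mode, validation is relaxed: no cross-field check yet.
public class EditingState : IOrderState
{
    public void ChangeEarliestEtd(Order order, DateTime value)
    {
        order.EarliestEtd = value;
    }
}

public class Order
{
    private IOrderState _state = new ActiveState();

    public DateTime EarliestEtd { get; internal set; }
    public DateTime LatestEtd { get; set; }

    public void StartEdit() { _state = new EditingState(); }

    public void ChangeEarliestEtd(DateTime value) { _state.ChangeEarliestEtd(this, value); }

    public void EndEdit()
    {
        // Whole-entity validation runs only here, at the state transition.
        List<string> errors = Validate();
        if (errors.Count > 0) throw new InvalidOperationException(string.Join("; ", errors));
        _state = new ActiveState();
    }

    private List<string> Validate()
    {
        var errors = new List<string>();
        if (EarliestEtd > LatestEtd) errors.Add("Earliest ETD must not be after latest ETD.");
        return errors;
    }
}
```

Because a failed `EndEdit` leaves the order in edit mode, the caller can correct the data and try the transition again.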

Then what about updates, inserts and deletes? In our implementations we use WCF, and the WCF API has only one method: IResponse[] Process(params IRequest[] requests); What does this really mean? It means that the client issues a batch of requests to the server. On the server side you implement a RequestHandler for every request that is defined in the system. Then you return a list of responses. Make sure that the Process() method is one unit of work (~one transaction). This means that if one of the requests in the batch fails, all of them fail, the transaction is rolled back, and no harm is done to the db. (Don't use error codes in the handlers; throw an exception instead.)
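A stripped-down sketch of that single entry point, without any WCF attributes, might look as follows. Only the `Process` signature and the IRequest/IResponse names come from the text; `IUnitOfWork`, `RequestProcessor`, `DelegateHandler` and the sample request and response types are hypothetical, and the sketch assumes the unit of work rolls back on `Dispose` when `Commit` was never called.

```csharp
using System;
using System.Collections.Generic;

public interface IRequest { }
public interface IResponse { }

public interface IRequestHandler
{
    IResponse Handle(IRequest request);
}

// Hypothetical abstraction; a real one would wrap a database transaction.
public interface IUnitOfWork : IDisposable
{
    void Commit();
}

// Convenience handler so small handlers can be written as lambdas.
public sealed class DelegateHandler : IRequestHandler
{
    private readonly Func<IRequest, IResponse> _handle;
    public DelegateHandler(Func<IRequest, IResponse> handle) { _handle = handle; }
    public IResponse Handle(IRequest request) { return _handle(request); }
}

public class GetOrderByIdRequest : IRequest { public int OrderId { get; set; } }
public class OrderResponse : IResponse { public int OrderId { get; set; } }

// In-memory stand-in that only records what happened to the "transaction".
public sealed class RecordingUnitOfWork : IUnitOfWork
{
    public bool Committed { get; private set; }
    public bool RolledBack { get; private set; }
    public void Commit() { Committed = true; }
    public void Dispose() { if (!Committed) RolledBack = true; }
}

public class RequestProcessor
{
    private readonly Dictionary<Type, IRequestHandler> _handlers = new Dictionary<Type, IRequestHandler>();
    private readonly Func<IUnitOfWork> _beginUnitOfWork;

    public RequestProcessor(Func<IUnitOfWork> beginUnitOfWork) { _beginUnitOfWork = beginUnitOfWork; }

    public void Register<TRequest>(IRequestHandler handler) where TRequest : IRequest
    {
        _handlers[typeof(TRequest)] = handler;
    }

    public IResponse[] Process(params IRequest[] requests)
    {
        var responses = new List<IResponse>();
        using (IUnitOfWork unitOfWork = _beginUnitOfWork())
        {
            foreach (IRequest request in requests)
            {
                // An exception from any handler skips Commit, so the whole batch is abandoned.
                responses.Add(_handlers[request.GetType()].Handle(request));
            }
            unitOfWork.Commit();
        }
        return responses.ToArray();
    }
}
```

Handlers signal failure by throwing, which is what lets a single `using` block give the all-or-nothing behaviour described above.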

I would recommend that you take a look at the Agatha messaging server. Davy Brion has great blog posts about the messaging layer. In our company we chose to implement our own messaging server, because we didn't need everything Agatha offered and we wanted to make some syntax improvements. Anyway, implementing a messaging server isn't really difficult, and it's a nice learning experience. Link: http://davybrion.com/blog/

Then what do you do with the DTOs? Well, you never send updated DTOs back to the server, but you do change them on the client side in order to give proper feedback to the GUI. So you make the presenters track everything that happens to the DTO, as requests, in the correct order. This will be your request batch. Then you send the request batch to the Process method on the WCF service, and the requests are "replayed" on the server side and handled by the request handlers. The presenter's job is therefore twofold: edit the DTO locally for GUI feedback, and keep track of all edits so they can be issued back to the server as a request batch (with the requests in the same order as the edits were made). Think of the following scenario: you retrieve an existing order, you edit it, then you commit the changes back to the db. This results in two batches, one for getting the order and one for committing the changes back.
RequestBatch 1:
GetOrderByIdRequest

(..then user edits data..)

RequestBatch 2:
StartEditOrderRequest , state change to edit mode, relaxed validation
AddConsigneeToOrderRequest
ChangeEarliestETDOnOrderRequest , no need for validation against latest ETD yet!
DeleteOrderlineRequest
ChangeNumberOfUnitsOnOrderlineRequest
EndEditOrderRequest , state change back to original state, perform entity validation here!
GetOrderByIdRequest , in order to update the GUI with the latest changes.
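The edit tracking behind the second batch could be sketched roughly like this. The request class names mirror the list above; `OrderEditRecorder` and every property on the requests are hypothetical, and in a real client this logic would sit inside the presenter.

```csharp
using System;
using System.Collections.Generic;

public interface IRequest { }

public class StartEditOrderRequest : IRequest { public int OrderId { get; set; } }
public class EndEditOrderRequest : IRequest { public int OrderId { get; set; } }
public class GetOrderByIdRequest : IRequest { public int OrderId { get; set; } }

public class AddConsigneeToOrderRequest : IRequest
{
    public int OrderId { get; set; }
    public string ConsigneeName { get; set; }
}

public class ChangeEarliestETDOnOrderRequest : IRequest
{
    public int OrderId { get; set; }
    public DateTime EarliestEtd { get; set; }
}

public class DeleteOrderlineRequest : IRequest
{
    public int OrderId { get; set; }
    public int OrderlineId { get; set; }
}

// Records each user edit as a request, in the order the edits happen.
public class OrderEditRecorder
{
    private readonly int _orderId;
    private readonly List<IRequest> _edits = new List<IRequest>();

    public OrderEditRecorder(int orderId) { _orderId = orderId; }

    public void AddConsignee(string name)
    {
        _edits.Add(new AddConsigneeToOrderRequest { OrderId = _orderId, ConsigneeName = name });
    }

    public void ChangeEarliestEtd(DateTime etd)
    {
        _edits.Add(new ChangeEarliestETDOnOrderRequest { OrderId = _orderId, EarliestEtd = etd });
    }

    public void DeleteOrderline(int orderlineId)
    {
        _edits.Add(new DeleteOrderlineRequest { OrderId = _orderId, OrderlineId = orderlineId });
    }

    // Wraps the recorded edits in the start/end transitions and a final re-read.
    public IRequest[] BuildBatch()
    {
        var batch = new List<IRequest>();
        batch.Add(new StartEditOrderRequest { OrderId = _orderId });
        batch.AddRange(_edits);
        batch.Add(new EndEditOrderRequest { OrderId = _orderId });
        batch.Add(new GetOrderByIdRequest { OrderId = _orderId });
        return batch.ToArray();
    }
}
```

The resulting array can be passed straight to `Process(params IRequest[] requests)` on the service.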

On the server side we use NHibernate. NHibernate uses the first-level cache to avoid heavy db load, so all requests within the same unit of work (request batch) will use the cache.

Each request should contain only the minimum amount of data needed to do the job. This means using the OrderId plus some other properties instead of the whole DTO. Regarding optimistic updates, you could send some of the old values along with the request; this is called the concurrency set. Remember that the concurrency set usually doesn't contain a lot of fields, because updating an order that has been changed in the meantime doesn't necessarily mean you have a race condition. E.g. adding an orderline while a consignee was edited by another user in the meantime doesn't mean you have a race condition.
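A small sketch of a request carrying its own concurrency set is shown below. The request name follows the batch example; the `Order` class, `ConcurrencyException`, the handler and all properties are assumptions for illustration, and the check is done in memory rather than against NHibernate.

```csharp
using System;

public class ChangeEarliestETDOnOrderRequest
{
    public int OrderId { get; set; }
    public DateTime NewEarliestEtd { get; set; }
    // Concurrency set: the one old value this change depends on.
    public DateTime OriginalEarliestEtd { get; set; }
}

// Hypothetical server-side entity.
public class Order
{
    public int Id { get; set; }
    public DateTime EarliestEtd { get; set; }
    public string Consignee { get; set; }
}

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

public static class ChangeEarliestEtdHandler
{
    public static void Handle(Order current, ChangeEarliestETDOnOrderRequest request)
    {
        // Only the fields in the concurrency set are compared; unrelated edits
        // by other users (e.g. to the consignee) do not count as a conflict.
        if (current.EarliestEtd != request.OriginalEarliestEtd)
        {
            throw new ConcurrencyException("Earliest ETD was changed by someone else.");
        }
        current.EarliestEtd = request.NewEarliestEtd;
    }
}
```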

Well, doesn't this lead to an awful lot of work? You will certainly have a lot more classes, but each class will be small and have a single responsibility.

BTW, we tried WCF RIA Services in a medium-sized project, and it didn't go that well. We had to find ways (hacks) around the framework to do what we wanted. It's also based on code generation, which is quite bad for a build server. Also, you should never expose your entities through the layers: you should be able to change the backend entities without affecting the client layer. With RIA this is very hard. I think OData falls into the same category as WCF RIA.

If you need to build queries on the client side, use the specification pattern rather than IQueryable; then you will be independent of the backend entities.
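As a sketch of what a client-built specification might look like: the client fills in a plain data object, and only the server knows how to turn it into a query against its entities. All names here are hypothetical, and the server side filters an in-memory sequence for simplicity; a real implementation would translate the specification into an NHibernate query instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Client side: plain data describing the query, with no reference to backend entities.
public class OrderSearchSpecification
{
    public string ConsigneeStartsWith { get; set; }
    public DateTime? EarliestEtdFrom { get; set; }
}

// Server side: entity that is never exposed to the client.
public class OrderEntity
{
    public int Id { get; set; }
    public string ConsigneeName { get; set; }
    public DateTime EarliestEtd { get; set; }
}

public static class OrderSearch
{
    // Translates the specification into filters; unset criteria are simply skipped.
    public static IEnumerable<OrderEntity> Apply(IEnumerable<OrderEntity> orders, OrderSearchSpecification spec)
    {
        IEnumerable<OrderEntity> result = orders;

        if (!string.IsNullOrEmpty(spec.ConsigneeStartsWith))
        {
            result = result.Where(o => o.ConsigneeName != null &&
                o.ConsigneeName.StartsWith(spec.ConsigneeStartsWith, StringComparison.OrdinalIgnoreCase));
        }

        if (spec.EarliestEtdFrom.HasValue)
        {
            result = result.Where(o => o.EarliestEtd >= spec.EarliestEtdFrom.Value);
        }

        return result;
    }
}
```

Renaming `ConsigneeName` on the entity then only touches `OrderSearch`, not the client.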

Good luck.
twitter: @lroal

Lars-Erik Roald 2010-09-30 10:55:55
