Design problem: shared, synchronized, data access in application... the best approach?

views:

150

answers:

+1 Q:

Design problem: shared, synchronized, data access in application... the best approach?

I have a web application (asp.net) where work items are submitted and allocated to users according to their workload.

A user's workload is calculated using an algorithm based on the number of items currently allocated to the user (and some other factors). Each new item is allocated to the user with the lowest current workload, increasing their workload. A user's workload decreases when they finish an item. Items will be submitted, allocated and finished concurrently - hence workload levels will be continually changing. Work items are stored in a SQL database.

I need a way to ensure that every allocation decision is made using an up-to-date picture of workload across the user base.

My idea is to provide a read/write synchronized store of workload information in the Cache.

Is this the best approach? Or should I use the database to control locking? How do I avoid bottle-necks in my application?

Advice much appreciated.

I would use a database to control this, as it's highly unlikely users will be completing work quickly enough to need a more real-time approach.

So I would have a series of tables, relating to work items, which you can query to calculate current work levels, and thus determine the next person to receive a particular work item.

You could then have a series of stored procedures to mark work items as complete, or to allocate a work item to someone.

To minimize bottle-necks ensure your database is normalised well, and your stored procedures do not use many tables, and this should ramp up nicely.

To check this, you could write some test harnesses to ensure the allocation of work items, and performance is what you desire under high usage.

Bravax 2008-10-14 21:05:41

+1 A:

This depends on many factors, when you refer to the cache do you mean the standard cache provided by Asp.Net?

Is it absolutely critical that you always have the most up to date information, or if two requests are made to be allocated is it ok for them to get allocated to the two least busy users at the moment the request was made?

You can indeed use a cache to store this information, however this generally assumes you will only be using one server, is it likely that you will be using clustering or load balancing for high loads?

The best advice I can give you is to build a well designed application, with a rich domain model representing the loads of each user and the queue and loosely coupled data access and plenty of automated unit and system tests. This way you can build a working application, get the system up and running quickly without worrying to much about optimisation and start performance testing\profiling as soon as possible.

If\when you encounter performance issues, you can then identity the bottlenecks by profiling\tracing and add optimisation as appropriate, which might be caching or optimised queries\views or combinations of things.

If you try to second guess where you're bottlenecks are and remove them, you will likely guess wrong and damage the design of the system. A well designed system can be optimised when you need it.

The way I would initially envisage this, would be a relational database (possibly using views or stored procedures to grab workload summary information quickly) with a data mapping layer between this and the domain model (which can use caching, lazy loading and identity mapping to provide efficiencies if needed). The domain model would largely hold a representation of the workload.

I would expect this to have users, work items, work queue and allocation strategy classes. Much of this could be kept in memory or stored locally between requests, each request could be represented by an event which would update the model.

E.g.
User completes a work item
Site raises a domain event to notify the domain model of change
Domain model receives event and updates work load for user

Work allocation would then be a matter of asking the domain model to allocate the work (which it would do via a strategy of finding the least allocated user) when needed. This could happen in the background outside of individual work requests and events raised to notify users next time they ask for work.

marcj 2008-10-14 21:06:54

+1 A:

Use the database to control this.

If for some reason, in the future, you need to expand out to utilize a web farm then you won't have a problem. However, if you are caching the data and working locally then it will cause interesting things to happen.

Also, you can take advantage of web garden settings to help manage any load your server would have; which isn't exactly possible in a cached situation.

Second, realize that this probably doesn't have to be perfect. If you have a high amount of work coming in (like distributing leads in a call center) then just getting really close as long as it's fast is all that matters.

Chris Lively 2008-10-14 21:25:55

Thanks for the update Marc.

The domain model you describe is largely what I have in place, albeit without the layer that handles allocation requests. Based on your advice of not pre-empting bottlenecks, I have created a trigger on my work items table that maintains a current workload table. I think I'm going to start by querying this workload table when allocation is required (rather than cache it) and then see if I can or need to improve performance from there...

flesh 2008-10-15 11:03:03

Sounds like a good, low impact design choice which doesn't limit your optimisation or processing options later. I tend to try and avoid putting much into the database, but it is a great option for speed.

marcj 2008-10-15 17:34:33

ansaurus

tags:

views:

answers:

Design problem: shared, synchronized, data access in application... the best approach?

related questions