background: we've got a number of server processes and client apps that are used entirely internally, in a fairly controlled environment. we capture a significant amount of data every day that goes into a couple of database machines. almost everything is C#, with a few C++ apps.

just about every app has some basic (if not extensive) dependence on database data, whether it's for historical data, daily-calculated values, or assorted parameters. as the whole environment has gotten a bit more sprawling, I've been wondering about the sense in sticking an intermediary between all client and server apps and the database: a sort of "database data broker". any app that needs values from the db would make a request to the data broker instead of calling a DLL wrapper function that runs a stored proc.

one immediate downside is that the data would make two trips across the network: from db to broker, and from broker to calling app. seems like poor form, but the amount of data in each request would be small enough that I'm ok with it performance-wise.

one (seeming) upside is that it would be trivial to set up a test environment: it would just mean standing up a test data broker, and there would be no db connection strings to maintain locally anywhere else. also, I've been pondering creating a mini request language so you wouldn't have to enumerate functions for each dataset you might request (instead of GetX() and GetY(), there would be a single Get("name = X")).
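
here's a rough sketch of the kind of broker interface I'm imagining (every name here is hypothetical; the "parsing" is deliberately trivial, and the handlers stand in for the stored-proc calls):

```csharp
// Minimal sketch of the "mini request language" idea. Clients call
// Get("name = X") instead of GetX()/GetY(), and the broker resolves the
// name against a registry of dataset handlers (in production each handler
// would wrap a stored-proc call).
using System;
using System.Collections.Generic;
using System.Data;

public interface IDataBroker
{
    DataTable Get(string request); // e.g. Get("name = DailyVolume")
}

public class DataBroker : IDataBroker
{
    private readonly Dictionary<string, Func<DataTable>> _handlers =
        new Dictionary<string, Func<DataTable>>(StringComparer.OrdinalIgnoreCase);

    public void Register(string name, Func<DataTable> handler)
    {
        _handlers[name] = handler;
    }

    public DataTable Get(string request)
    {
        // Trivial parse of "name = X"; a real request language would need
        // parameters (dates, instruments, etc.) and proper escaping.
        string[] parts = request.Split('=');
        if (parts.Length != 2 || parts[0].Trim() != "name")
            throw new ArgumentException("expected \"name = <dataset>\"");

        Func<DataTable> handler;
        if (!_handlers.TryGetValue(parts[1].Trim(), out handler))
            throw new KeyNotFoundException("unknown dataset: " + parts[1].Trim());
        return handler();
    }
}
```

the idea being that adding a new dataset becomes a registry entry on the broker rather than a new client-facing function.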

am I over-engineering this, or is it possibly a worthy architecture?

edit: thanks for all the great comments so far, great food for thought.

+2  A: 

only you can answer these:

  • what are the benefits of doing this?
  • what are the problems/risks of doing this?
  • do you need this to make testing easier or even possible?
  • if you make this change and it crashes when it goes live, will you be fired?
  • if you make the change and it goes live successfully, will you get a promotion?
  • etc...
KM
A: 

A data broker sounds like a really good way to abstract out the multiple data sources for your apps. It would be easy to consolidate, change repositories, or otherwise move data around if needed in the future.

Miguel
+4  A: 

It depends on what you're trying to accomplish with it. According to Rocky Lhotka, you should only add a tier if you are forced to, kicking and screaming all the way.

I agree with him: don't tier unless you need to. I think there are valid reasons to add additional tiers, usually for purposes of security, scalability and maintainability. The question becomes: is yours a valid reason?

It looks like the major reason is maintainability. Does it outweigh the benefits you get by not having the tier?

Randolpho
A: 

I may be misunderstanding something, but it seems to me like you should consider an entity framework, i.e. a framework you can use to "map" your interaction with the db to domain objects. That way you work locally on domain objects that get filled from your db, and when it is time to persist the state of your objects to the database, the framework handles all the connections back and forth. In this way you can also easily mock up these domain objects for unit testing without needing a db connection.

Check out NHibernate for a good example of such a framework.
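
A rough sketch of the shape this gives you (the Trade class and repository names are hypothetical, and the NHibernate mapping files and session-factory setup are elided; ISession/ISessionFactory/ITransaction are NHibernate's own types):

```csharp
using NHibernate;

public class Trade
{
    // NHibernate's lazy-loading proxies require virtual members.
    public virtual int Id { get; protected set; }
    public virtual string Symbol { get; set; }
    public virtual decimal Quantity { get; set; }
}

public interface ITradeRepository
{
    Trade GetById(int id);
    void Save(Trade trade);
}

// Production implementation: NHibernate generates and runs the SQL.
public class NHibernateTradeRepository : ITradeRepository
{
    private readonly ISessionFactory _sessionFactory;

    public NHibernateTradeRepository(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public Trade GetById(int id)
    {
        using (ISession session = _sessionFactory.OpenSession())
        {
            return session.Get<Trade>(id);
        }
    }

    public void Save(Trade trade)
    {
        using (ISession session = _sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            session.SaveOrUpdate(trade);
            tx.Commit();
        }
    }
}
```

In unit tests you substitute a stub or mock ITradeRepository for the NHibernate-backed one, so no db connection is needed.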

erikric
A: 

If you already have the database-related know-how, I think it's not a bad decision.

Good things that I can think of:

  • if the data model is consistent you can plug in new tools easily without making any changes in the other apps.
  • maybe you can keep the database running more reliably than your apps, so if one of them fails, the others can keep working.
  • you can make backups and rollbacks using the database tools.
  • you can do emergency fixes manipulating the data directly with sql or some visual tool.

But if you have to learn new frameworks along the way, maybe the benefits are not worth the extra initial effort.

fortran
>> But if you have to learn new frameworks along the way, maybe the benefits are not worth the extra initial effort.
Or, it's especially worth the initial effort (if you're into that kind of thing :) )
AlexCuse
yes, of course learning new things is worthwhile in itself _in the long term_... but if there are schedules to comply with, maybe it's not the best moment :)
fortran
+1  A: 

As the former architect of a system that also used a database heavily as a "hub," I can say that there are several drawbacks that you should be aware of. Our system used databases:

  • As a transaction store (typical OLTP stuff)
  • As a staging queue (submitted but unprocessed transactions)
  • As a historical data store (results of processed transactions)
  • As an interoperation layer (untranslated commands or transactions issued from other systems)

One of the major drawbacks is ownership cost. When your databases become the single point of failure for so many types of operations, it becomes necessary to ensure that they are all hosted in high-availability environments. This is not only expensive from a hardware perspective, but it is also expensive to support deployments to HA environments, since developers typically have very limited visibility into the internals.

A second drawback is that you have to seriously design integrity into all of your tables. In a typical SOA environment, you have complete control over how data is modified. When you expose data through database tables, you must assume that any application with the right credentials can modify it, so the constraints themselves must live in the database. If you had a single service managing persistence, you could be much looser with constraints on the database and enforce them in code, as in the sketch below.
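
For illustration (all names invented for this sketch):

```csharp
// When a single service owns persistence, invariants like these can live in
// code; once many apps write the tables directly, the same rules have to
// become database constraints so that every writer obeys them.
using System;

public class Order
{
    public decimal Quantity { get; set; }
    public DateTime TradeDate { get; set; }
}

public class OrderPersistenceService
{
    public void Save(Order order)
    {
        // Enforceable here only because this service is the sole writer.
        if (order.Quantity <= 0)
            throw new ArgumentException("quantity must be positive");
        if (order.TradeDate > DateTime.Today)
            throw new ArgumentException("trade date cannot be in the future");

        // database write elided
    }
}
```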

Third, if you ever want to expose to outside parties any functionality that the database tables currently provide, you must write service code anyway, so you might be better served doing it strategically rather than reactively.

Fourth, UI interaction directly with the data layer creates security risks, especially if the client is a thick client.

Finally, writing code that responds to events (service calls) is much easier than polling code. Typically, organizations that rely heavily on database polling end up reinventing the wheel every time a new project requires a new "monitoring service." It can be avoided by creating a "framework," but those have their own pitfalls (primarily around prescription versus adoption).
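
A hypothetical sketch of the contrast (names invented): the polling version owns scheduling, batching, and "have I seen this row yet?" bookkeeping, while the service-call version just reacts when the producer invokes it.

```csharp
using System;
using System.Threading;

// Polling: every consumer of a staging table ends up rewriting this loop.
public class StagingTablePoller
{
    public void Run(Func<long, long> processNewRowsSince)
    {
        long lastSeenId = 0;
        while (true)
        {
            // query for rows with id > lastSeenId, process them,
            // and remember the new high-water mark
            lastSeenId = processNewRowsSince(lastSeenId);
            Thread.Sleep(TimeSpan.FromSeconds(5)); // poll interval: tune and hope
        }
    }
}

// Event-driven: no loop, no bookkeeping; the caller signals when work exists.
public interface ITransactionSink
{
    void Submit(string transactionPayload);
}
```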

This is just a laundry list of problems I have encountered. It's not necessarily meant to dissuade you from using databases for these functions, but it helps to know the dangers ahead of time so you can at least plan for them if they ever do become issues.

EDIT

Just thought of another scenario that caused us pain. Versioning your changes can be difficult. For example, if you need to change the shape of a table (normalize/denormalize), the change has a cascading effect if multiple applications rely on it. In a SOA scenario, it is much easier, because you can keep your old API, change the internal interaction so that it works with the changed tables, and allow consumers to migrate to the new version on their own schedule.
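
A hypothetical sketch of what that looks like (invented names): the underlying tables were normalized, but v1 consumers keep their old contract while new consumers move to v2 at their own pace.

```csharp
public interface IPositionServiceV1
{
    // Original flat result, preserved after the schema change; the
    // implementation now joins the new tables to rebuild this shape.
    string GetPositionCsv(string account);
}

public interface IPositionServiceV2
{
    PositionV2 GetPosition(string account, string symbol);
}

public class PositionV2
{
    public string Account { get; set; }
    public string Symbol { get; set; }
    public decimal Quantity { get; set; }
}
```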

Michael Meadows
A: 

"any app that needs values from the db makes a request to the data broker"

When database technology was being invented over 40 years ago, the people doing that inventing had ideas along the lines of "any app that needs values from the db makes a request to the dbms".

Have you ever pondered the possibility that YOU ALREADY HAVE a "data broker", and that there might be very little added value in creating a second one of your own?