views:

120

answers:

4

We are in the early stages for putting a design for a big business application that will have multiple modules. One of the requirements is that the application should be database independent, it should support sql server, oracle, mysql and DB2.

From what I have read on the web is that database independence is a very bad idea, it would result in a hard to maintain code,database design with the least common features in all supported DBMSs, bad performance and bad scalability. My personal gut feeling is that this is the most feature ever that could increase the development cost and time exponentially with complexity. The code will be dreadful.

But I cannot persuade anybody to ignore this feature. The problem is that most data on this issue are empirical data, no numbers to support the case. If any one has any numbers supported data on the issue I would appreciate it.

One of the possible design options is to use Entity framework for the database tier with provider for each DBMS. My personal feeling is that writing sql statements manually without any ORM would be a must since you have no control on the sql generated by entity framework and a database independent scenario will need some sql tweaking based on the DBMS the code is targeting, and I think that third party entity framework providers will have a significant amount of bugs that only appears in complex scenarios that the application will have. I would like to hear from anyone who had an experience with using entity framework for database independent scenario before.

Also one of the arguments discussed in the team is to support one DBMS (sql server for example) in the first iteration and then add support for other DBMSs in next iterations. I think that since we will need a database design with the least common features this development strategy is bad, since we need to know all the features of all databases before we start writing code for the first DBMS. I need to hear from you about this point too.

+1  A: 

I work with Hibernate which gives me the benefits of the ORM plus the database independence. Database specific features are out of the question and this usually improves my design. Everything (domain model, business logic and data access methods) are testable so development is not painful.

cherouvim
+1 for improving design - seems that genuine relational modelling is increasingly a foreign concept
Steve De Caux
A: 

Database independence is an overrated application feature. In reality, it is very rare for a large business application to be moved onto a new database platform after it's built and deployed. You can also miss out on DBMS specific features and optimisations.

That said, if you really want to include database independence, you might be best to write all your database access code against interfaces or abstract classes, like those used in the .NET System.Data.Common namespace (DbConnection, DbCommand, etc.) or use an O/RM library that supports multiple databases like NHibernate.

cxfx
It's not *very* rare. If you develop a software product which will serve as your product line and you'll be customizing for many customers, then you probably need this. We have 2 of those at work.
cherouvim
Certainly, in the cases where one of your product's main features is to be customised and integrated with heterogeneous client platforms it's not rare. But, for general applications, it's very rare.
cxfx
@cherouvim: I've never seen a DBMS port. I've seen plenty of client ports. But then I work for a large corporate, not a software house.
gbn
Database independence is an application feature Database Salesman dislike intensely, it shafts their lock-in business model. Time to smell the roses.
Steve De Caux
@Steve: as opposed to ORM vendors...?
gbn
+1  A: 

مرحبا , Muhammed!

Database independence is neither "good" nor "bad". It is a design decision; it is a trade-off.

Let's talk about the choices:

It would result in a hard to maintain code
This is the choice of your programmers. If you make your code database-independent, then you should use a layer between your code and the database. The best kind of layer is one that someone else has written.

...Database design with the least common features in all supported DBMSs
This is, by definition, true. Luckily, the common features in all supported databases are fairly broad; they should all implement the SQL-99 standard.

...bad performance and bad scalability This should not be true. The layer should add minimal cost to the database.

...this is the most feature ever that could increase the development cost and time exponentially with complexity. The code will be dreadful. Again, I recommend that you use a layer between your code and the database.

You didn't specify which language or platform you're writing for. Luckily, many languages have already abstracted out databases:

Good luck.

Chip Uni
The bad performance comes from using the least feature set. For example you cannot use auto incremented identity since it is not supported in some DBMSs. Ignoring a lot of these features will be overridden by multiple trips to the database which will hurt performance and scalability.
Muhammad Adel
Ah, true enough, Muhammed.
Chip Uni
+3  A: 

Have you looked at Comparison of different SQL implementations ?

This is an interesting comparison, I believe it is reasonably current.

Designing a good relational data model for your application should be database agnostic, for the simple reason that all RDBMSs are designed to support the features of relational data models.

On the other hand, implementation of the model is normally influenced by the personal preferences of the people specifying the implementation. Everybody has their own slant on doing things, for instance you mention autoincremented identity in a comment above. These personal preferences for implementation are the hurdles that can limit portability.

Reading between the lines, the requirement for database independence has been handed down from above, with the instruction to make it so. It also seems likely that the application is intended for sale rather than in-house use. In context, the database preference of potential clients is unkown at this stage.

Given such requirements, then the practical questions include:

  1. who will champion each specific database for design and development ? This is important, inasmuch as the personal preferences for implementation of each of these people need to be reconciled to achieve a database-neutral solution. If a specific database has no champion, chances are that implementing the application on this database will be poorly done, if at all.
  2. who has the depth of database experience to act as moderator for the champions ? This person will have to make some hard decisions at times, but horsetrading is part of the fun.
  3. will the programming team be productive without all of their personal favourite features ? Stored procedures, triggers etc. are the least transportable features between RDBMs.

The specification of the application itself will also need to include a clear distinction between database-agnostic and database specific design elements/chapters/modules/whatever. Amongst other things, this allows implementation with one DBMS first, with a defined effort required to implement for each subsequent DBMS.

Database-agnostic parts should include all of the DML, or ORM if you use one.

Database-specific parts should be more-or-less limited to installation and drivers.

Believe it or not, vanilla-flavoured sql is still a very powerful programming language, and personally I find it unlikely that you cannot create a performant application without database-specific features, if you wish to.

In summary, designing database-agnostic applications is an extension of a simple precept:


Encapsulate what varies


Steve De Caux