views:

298

answers:

9

Hello,

I'm designing a few applications that will share 2 or 3 database tables and all of the other tables will be independent of each app. The shared databases contain mostly user information, and there might occur the case where other tables need to be shared, but that's my instinct speaking.

I'm leaning over the one database for all applications solution because I want to have referential integrity, and I won't have to keep the same information up to date in each of the databases, but I'm probably going to end with a database of 100+ tables where only groups of ten tables will have related information.

The database per application approach helps me keep everything more organized, but I don't know a way to keep the related tables in all databases up to date.

So, the basic question is: which of both approaches do you recommend?

Thanks,

Jorge Vargas.

Edit 1:

When I talk about not being able to have referential integrity, it's because there's no way to have foreign keys in tables when those tables are in different databases, and at least one of the tables per application will need a foreign key to one of the shared tables.

Edit 2:

Links to related questions:

Only the second one has an accepted answer. Still haven't decided what to do.

Answer:

I've decided to go with a database per application with cross-database references to a shared database, adding views to each database mimicking the tables in the shared database, and using NHibernate as my ORM. As the membership system I'll be using the asp.net one.

I'll also use triggers and logical deletes to try and keep to a minimum the number of ID's I'll have flying around livin' la vida loca without a parent. The development effort needed to keep databases synced is too much and the payoff is too little (as you all have pointed out). So, I'd rather fight my way through orphaned records.

Since using an ORM and Views was first suggested by svinto, he gets the correct answer.

Thanks to all for helping me out with this tough decision.

+1  A: 

It depends and your options are a bit different depending on the database and frameworks you're using. I'd recommend using some sort of ORM and that way you don't need to bother that much. Anyways you could probably put each app in it's own schema in the database and then either reference the shared tables by schemaname.tablename or create views in each application schema that's just a SELECT * FROM schemaname.tablename and then code against that view.

svinto
What do you mean by I won't need to bother if I'm using some sort of ORM? How can an ORM help me solve my problem? If you could give some pointers on this I'd really appreciate it.I kind of like the view approach you mention, but I'll still have the referential integrity problem. I'm gonna have to give it a try to see how it works.
Jorge Vargas
Point your foreign keys to the views and you're fine with the referential integrity.If you use a ORM you code to that instead of the tables, so you don't need to bother with the whole lot except when creating new tables or setting indexes, keys, etcetera.
svinto
Is it possible to point foreign keys to views with MS SQL Server 2005? This answer http://stackoverflow.com/questions/447122/can-i-have-a-foreign-key-referencing-a-column-in-a-view-in-sql-server/447151#447151 says that it isn't possible. I'll try to do it with SQL code since the designer only showed me tables and I'll let you know if I can make this work. Thanks.
Jorge Vargas
Nope, this didn't work. Got the same errors as the guy posting the question.
Jorge Vargas
I think that depends on what database you're using.
svinto
+1  A: 

Hi

One database for all application in my opinion .Data would be stored once no repitation.

With the other approach you would end up replicating and in my opinion when you start replicating it will bring its own headache and data would go out of sync too

Lisa
Yeah, that's my main issue with the database per application approach. I could probably solve this if I have something like a "MainDatabase" that has the shared data with some triggers to update the different databases, and creating an Admin app to manage that "MainDatabase".
Jorge Vargas
I like this, main DB, being a master source of "common" table would be close to ideal if you're fine with temporary cross-app data mismatches because of replication delays.
bobah
Lisa - what about security? One app can, if not taken care of, can access other app's content.
MasterGaurav
+3  A: 

Neither way looks ideal :)

I think you should consider not making a references in database layer for cross-application relations, and make them in application layer. That would allow you to split it to one database per app.

I'm working on one app with 100+ tables. I have them in one database, and are separated by prefixes - each table has prefix for module it belongs to. Than i have built layer on top of database functions to use this custom groups. I'm also building data administrator, which takes advantage of this table groups and makes editing data very easy.

praksant
Ha! Never saw that answer coming.Thanks for sharing your real life experience. I'm also going to build a data administration app, and in the beginning it will only need to access the shared (user information) tables. Now that I think about this, I should also have mentioned this in the question. Anyways, in this application the administrator will choose which users have access to which application. That's another reason why I lean over the one database approach.
Jorge Vargas
+1  A: 

There are no hard and fast rules to choose one over the other.

Multiple databases provide modularity. As far as sync-ing across multiple databases are concerned, one can use the concept of linked servers and views thereof and can gain the advantages of integrated database (unified access) as well.

Also, keeping multiple databases can help better management of security, data, backup & restore, replication, scaling out etc!

My 2cents.

MasterGaurav
Yeah, I know. That's why I'm asking for the opinions of the experts! The most feedback I can get will help me make the most informed opinion. You're the third person that had told me more benefits of multiple databases than the single database approach. Thanks.
Jorge Vargas
+1  A: 

The most appropriate approach from scalability and maintenence point of view would be to make the "shared/common" tables subset self-sufficient and put it to "commons" database, for all others have 1 db per application of per logical scope (business logic determined) and maintain this structure always

This will ease the planning and execution commissioning/decommissioning/relocation/maintenence procedures of your software (you will know exactly which two affected DBs (commons+app_specific) are involved if you know which app you are going to touch and vice versa.

bobah
Those are good things to keep in mind. Anything that helps keep maintenance easier is something to consider strongly, and knowing which tables I'm gonna be changing when modifying an app should (not must) help me make less errors. Other thing to notice is that I'm the only developer working on this, no team right now, but the multiple database approach should be an easier to grasp concept for the developer after me.
Jorge Vargas
You might also want in future to deploy N instances of app1 (failover, or serving internal/external clients or whatever) and M instances of app2 (same reason but M != N). And if you are concerned about data referential consistency it can be worth having some simple reconciliation script running daily and sending a report to people in charge in case of problems
bobah
+2  A: 

THat does not sound like "a lot of applications" at all, but like "one application system with different executables". Naturally they can share one database. Make smart usage of Schemata to isolate the different funcational areas within one database.

TomTom
They are a lot of applications, they only happen to share the same user base and a need to have foreign keys to "UserProfiles" tables, but their functionality it's totally different.
Jorge Vargas
Separate databaes then, and a separate membership subsystem ;)
TomTom
+1  A: 

At our business, we went with a separate database per application, with cross database references for the small amount of shared information and an occasional linked server. This has worked pretty well with a development, staging, build and production environments.

For users, our entire user base is on windows. We use Active Directory to manage the users with application references to groups, so that the apps don't have to manage users, which is nice. We did not centralize the group management, that is each application has tables for groups and security which is not so nice but works.

I would recommend, that if your applications are really different, to have a database per application. Looking back, the central shared database for users sounds workable as well.

You can use triggers for cross database referential integrity:

Create a linked server to the server that holds the database that you want to reference. Then use 4-part naming to reference the table in the remote database that holds the reference data. Then put this in the insert and update triggers on the table.

EXAMPLE(assumes single row inserts and updates):

DECLARE @ref (datatype appropriate to your field)

SELECT @ref = refField FROM inserted

IF NOT EXISTS (SELECT *
FROM referenceserver.refDB.dbo.refTable
WHERE refField = @ref)
BEGIN
RAISERROR(...)
ROLLBACK TRAN
END

To do multi row inserts and updates you can join the tables on the reference field but it can be very slow.

Decker97
I'd love to use Active Directory, that way I wouldn't have any of the problems I mention here, but we're using other ways to authenticate people. I could go with cross database references and linked servers, but that way I'd lose referential integrity (at least in sql server, I don't know if that would happen in another RDBMS), which I'd like to keep. This is the first time I face this problem, so I'm lost here. None of the solutions seem right.
Jorge Vargas
+1  A: 

I think the answer to this question depends entirely on your non functional requirements. If you are designing a application that will one day need to be deployed across 100's of nodes then you need to design your database so that if need be it could be horizontally scaled. If on the other hand this application is to be used by a hand full of users and may have a short shelf life then you approach will be different. I have recently listened to a pod cast of how EBAY's architecture is set-up, http://www.se-radio.net/podcast/2008-09/episode-109-ebay039s-architecture-principles-randy-shoup, and they have a database per application stream and they use sharding to split tables across physical nodes. Now their non-functional requirements are that the system is available 24/7, is fast, can support thousands of users and that is does not lose any important data. EBAY make millions of pounds and so can support the effort that this takes to develop and maintain.

Anyway this does not answer your question:) my personnel opinion would be to make sure your non-functional requirements have been documented and signed off by someone. That way you can decide on the best Architecture. I would be tempted to have each application using its own database and a central database for shared data. And I would try to minimise the dependencies between them, which I'm sure is not easy or you would have done it:), but I would also try to steer clear of having to produce some sort of middle ware software to keep tables in sync as this could create a headaches for you.

At the end of the day you need to get your system up and running and the guys with the pointy hair wont give a monkeys chuff about how cool your design is.

The Abbott of Monkseaton
Thanks for the podcast link, I'll listen to it as soon as I have a chance. I know pointy haired guys aren't gonna notice my design, but I and other developers will, and the less painful I can make it for me the better for the developers after me.
Jorge Vargas
+1  A: 

We went for splitting the database down, and having one common database for all the shared tables. Due to them all being on the save SQL Server instance it didn't affect the cost of running queries across multiple database.

The key in replication for us was that whole server was on a Virtual Machine (VM), so for replication to create Dev/Test environments, IT Support would just create a copy of that image and restore additional copies when required.

kevchadders