views:

134

answers:

6

Note: I have no intention of implementing this, it's more of a thought experiment.

Suppose I had multiple services available through a web interface. At least two of which required user registration and some data in a database. A single registration would grant access to all services. Like Google (GMail, Google Docs, etc.).

Would all of these services, which are related to registered users, be located within a single database, perhaps with table-prefixes for what service they were for?

Or would each service have it's own database? The only plus I can see to doing this is that it would make table names cleaner. Any time any user interaction would be needed, interacting with at least two different databases would be needed, which would needlessly complicate sql queries.

Would this suggest that the 'big boys' use only a single database, and load it with tons of different (and perhaps completely unrelated) tables?

+3  A: 

I do not think it is a good idea to make multiple services dependent on a single database. If you need to restore some service from a backup, you'll have to restore all. You are overloading a database server probably too. I would do that only if it is likely they will share much data at future point.

Also you might consider smaller database with only the shared user data.

EFraim
Keeping data consistent when restoring in this situation is another challenge.
Dana the Sane
+3  A: 

I would consider having 1 user / role repository with a separate database for services.

Cody C
+1  A: 

There's always a problem with potentially overloading databases and access thereof; replication is one potential good solution.

McWafflestix
+1  A: 

I've never done this, but I think it would depend on performance. If there's almost no overhead to do separate databases, that might be the answer. Doing separate DBs may also make it easy to split DBs across machines.

Complexity is also an issue. Hopefully your schema would be defined in such a way that you wouldn't need to dip into several different databases for different queries.

Chris Simmons
+2  A: 

If you use the right DBMS, you can have the best of both strategies. In PostgreSQL, within a 'database' you can have separate schemas. The authentication service would access a single schema and provide the other services a key which is used as a reference for data in the other schemas. You can still deal with the entire database as a single entity i.e:

  • query across schemas without using dblink
  • store personally identifiable information separately (schemas can have separate per-user permissions to further protect data)
  • DBMS managed foreign key constrains (I believe)
  • consistent (re the data) backup and restore

You get these advantages at the cost of a more complex DAL (may not be supported by your favorite DAL framework) and less portability between DBMS's.

Dana the Sane
That sounds pretty cool. All of my experience happens to be with SQL Server and/or MySQL, so I'll have to look deeper into this.
nilamo
This is also true of SQL Server 2005 and later.
Jason Musgrove
So after some reading, this appears to be the best solution, as it takes the best parts of each different approach. On the one hand, all the tables can be grouped in schemas related to what service they are for, which provides a clean an logical organization. On the other hand, this data, while seperated, is also still easy to access and cross-reference without resorting to multiple queries. Thank you Dana.
nilamo
+1  A: 

There are several strategies.

When you move to multiple databases (or multiple servers), things get more complex. Your core user information could be in a single database. The individual services could be in other databases. The problem with that is that the database is the outer unit of referential integrity, so you cannot design in foreign keys across databases. One way around this is to distribute changes to the core master tables (additions and updates only, obviously, since deletions would be forbidden due to a foreign-key constraint) to separate databases on a regular basis, and then enforce RI against these copies of the core master database tables within the service databases. This also means that the service databases and their services can run while the other databases are down for maintenance. Obviously this is an increased architectural complexity for an improvement to your service windows and reduced coupling.

I would recommend starting with a single database. If your RDBMS supports it, I would organize components according to SCHEMAs which would allow you to at least maintain a logical separation by design. You can more easily refactor later.

Many databases have tables which can be considered unrelated. Sometimes in a system you have multiple entity networks that hardly connect (sometimes not at all). You can use SCHEMAs in these cases too.

Cade Roux