How do you split up your data for an application based on a database? A lot of software will have tables named something like Auth_Users, Auth_Whatever, Auth_Something, and then Admin_Something, Admin_Whatever, etc. Essentially the tables all exist in one database but they are organized by a naming convention. The other option would be to split some of this data into several databases. What are the pros and cons of doing one or the other?

A: 

There's a third option that you haven't considered: Users-Groups-Roles. Model these three entities and the relationships between them in a single credentials store, either a relational database or LDAP.

Assign credentials to a particular Role; give a Group a Role; add Users to Groups.
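As a rough illustration of that layout, here is a minimal sketch using Python's built-in sqlite3 module. The table names (auth_roles, auth_groups, and so on), the idea of storing credentials as plain permission strings, and the rule that each group has exactly one role are all assumptions made for the example, not something the answer prescribes.

```python
import sqlite3


def build_credentials_db():
    """Create an in-memory credentials database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE auth_roles (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Credentials (modelled here as permission strings) belong to a role.
        CREATE TABLE auth_role_permissions (
            role_id    INTEGER NOT NULL REFERENCES auth_roles(id),
            permission TEXT NOT NULL,
            PRIMARY KEY (role_id, permission)
        );
        -- Assumption: each group is given exactly one role.
        CREATE TABLE auth_groups (
            id      INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE,
            role_id INTEGER NOT NULL REFERENCES auth_roles(id)
        );
        CREATE TABLE auth_users (
            id    INTEGER PRIMARY KEY,
            login TEXT NOT NULL UNIQUE
        );
        -- Users are added to groups; a user may be in several groups.
        CREATE TABLE auth_group_members (
            user_id  INTEGER NOT NULL REFERENCES auth_users(id),
            group_id INTEGER NOT NULL REFERENCES auth_groups(id),
            PRIMARY KEY (user_id, group_id)
        );

        INSERT INTO auth_roles (id, name) VALUES (1, 'editor');
        INSERT INTO auth_role_permissions VALUES (1, 'read'), (1, 'write');
        INSERT INTO auth_groups (id, name, role_id) VALUES (1, 'staff', 1);
        INSERT INTO auth_users (id, login) VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO auth_group_members VALUES (1, 1);
    """)
    return conn


def permissions_for(conn, login):
    """Return the sorted permissions a user inherits through group membership."""
    rows = conn.execute(
        """
        SELECT DISTINCT rp.permission
        FROM auth_users u
        JOIN auth_group_members gm ON gm.user_id = u.id
        JOIN auth_groups g ON g.id = gm.group_id
        JOIN auth_role_permissions rp ON rp.role_id = g.role_id
        WHERE u.login = ?
        ORDER BY rp.permission
        """,
        (login,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_credentials_db()
print(permissions_for(conn, "alice"))  # alice is in 'staff', which has the 'editor' role
print(permissions_for(conn, "bob"))    # bob belongs to no group
```

Because everything lives in one database, a single join answers "what may this user do?", and the foreign keys stop a group from pointing at a role that doesn't exist.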

I wouldn't split these into separate databases. Scattering the data makes maintenance more difficult, in my opinion.

duffymo
A: 

When your data is split up as you suggested, you start having to jump through hoops to get at it, and to be able to use it in queries or applications. You also have increased maintenance concerns such as permissions, synchronization, backups, enforcing business rules, etc. One instance where you'll likely run into this scenario is when there are multiple off-the-shelf applications already in use and you need to relate the information in all of them.

I'm sure there are other instances where it's simply not feasible to have everything together in one place. In those cases, creating a new database for the express purpose of linking to data in multiple other sources (using views, stored procedures, functions, etc.) can help make the data appear integrated when it isn't. But it will be more work, and you won't have much control over integrity features, which results in flimsy linkage schemes to your other data sources. "We link on first name and last name, but we get all kinds of false positives because DataEntry Dude fat-fingered the entry a dozen times, so we have multiple misspelled copies that appear to sort of maybe match." You don't want any more of that scenario than you have to have.
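To make the "linking database" idea and its weakness concrete, here is a small sketch with Python's sqlite3 module. The two source databases (crm and billing), their tables, and the sample rows are all hypothetical; SQLite's ATTACH DATABASE stands in for whatever cross-database mechanism your server offers, and a TEMP view is used because an ordinary SQLite view cannot reference tables in another attached database.

```python
import sqlite3


def build_link_db():
    """Attach two separate (hypothetical) databases and link them with a view."""
    link = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a distinct in-memory database.
    link.execute("ATTACH DATABASE ':memory:' AS crm")
    link.execute("ATTACH DATABASE ':memory:' AS billing")
    link.executescript("""
        CREATE TABLE crm.contacts (
            id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
        );
        CREATE TABLE billing.customers (
            id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, balance REAL
        );

        -- The second contact is a mistyped copy of the first.
        INSERT INTO crm.contacts (first_name, last_name)
        VALUES ('John', 'Smith'), ('Jon', 'Smith');

        -- Two different people who happen to share a name.
        INSERT INTO billing.customers (first_name, last_name, balance)
        VALUES ('John', 'Smith', 10.0), ('John', 'Smith', 25.0);

        -- No shared key exists, so the view falls back to matching on names.
        CREATE TEMP VIEW contact_balances AS
        SELECT c.id AS contact_id, b.id AS customer_id, b.balance
        FROM crm.contacts c
        JOIN billing.customers b
          ON b.first_name = c.first_name AND b.last_name = c.last_name;
    """)
    return link


link = build_link_db()
rows = link.execute(
    "SELECT contact_id, customer_id, balance "
    "FROM contact_balances ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

The view makes the data look integrated, but contact 1 is paired with both billing customers and the mistyped contact 2 matches nothing. No foreign key spans the two databases, so nothing in the schema can prevent or flag either problem.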

In my opinion, a good rule of thumb is to try and keep related data (or data you think may at some point be related) in one database as much as possible, and spend a lot of time thinking about appropriate naming schemes, relationships, and keys/indexes. An exception to this may be data that's very generic, like zip code data the gov't provides from census results. Data like that may end up being used in all sorts of other databases you have, but it's not really related to any of them. In that case, unless you're lucky enough to have just one database, I'd create another database to put all the oddball data into, and then link to it from the other databases that need the info. This isn't always ideal either, though, hence some people characterizing database design as an art as much as a science.

There are always curve-balls that have to be handled, but keep the mindset that you care about the data you're handling, which means always thinking about how to maintain the integrity of the information (e.g. enforcing business rules and relationships, keeping duplicates out, keeping redundancy down, etc).
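When the related tables do live in one database, the database itself can enforce most of those rules. The sketch below uses Python's sqlite3 module; the customers and orders tables and the specific rules (unique email, positive quantity) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE                  -- keeps duplicates out
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customers(id),       -- enforces the relationship
        quantity    INTEGER NOT NULL
                    CHECK (quantity > 0)            -- a simple business rule
    );
    INSERT INTO customers (id, email) VALUES (1, 'a@example.com');
""")


def try_insert(conn, sql, params):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(
    conn, "INSERT INTO customers (email) VALUES (?)", ("a@example.com",))
orphan_ok = try_insert(
    conn, "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)", (99, 1))
zero_qty_ok = try_insert(
    conn, "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)", (1, 0))
valid_ok = try_insert(
    conn, "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)", (1, 3))

print(duplicate_ok, orphan_ok, zero_qty_ok, valid_ok)
```

Only the last insert is accepted. The duplicate email, the order for a nonexistent customer, and the zero-quantity order are all rejected by the schema rather than by application code.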

pheadbaq