views:

160

answers:

3

A question regarding a DB development project. The database already exist and is rather large (several TBs).

  1. What do you use for version control in DB development?
  2. How do you control concurrent changes to the data model by different teams
  3. What is your approach to the Unit Testing in the DB development
  4. How do you deal with the sensitive data if the DB owners do not know what is sensitive? What is your approach to the data obfuscation? What are your obfuscation techniques?
  5. How do you work on a large DB from several locations?

Please answer one or more of the items as you see fit. Each answer will be reviewed separately. Thank you very much!

EDIT: A related question with good answers to the p.1 is here: http://stackoverflow.com/questions/175451/how-do-you-version-your-database-schema

+2  A: 

For 4, "How do you deal with the sensitive data if the DB owners do not know what is sensitive? What is your approach to the data obfuscation?"

"Sensitive until proven innocuous" is my mantra. Unless someone makes a case for not adequately protecting any data from visibility (either internal or external) then my default mode is to protect it.

Cases come up later on where we'll open data up for perfromance, reporting, etc reasons, but a documented business case with the appropriate signatures is required.

Adam Davis
+2  A: 

For most of these, while the tools don't apply the general processes of code development do:

  • Maintain a development system separate from production with enough data to get useful performance metrics when testing a new model
  • This system has unit tests (SQL queries, commits, aborted atomic commits, etc) written and run against it prior to every release.
  • There are official 'releases'
  • The development database is the source control system itself - in other words the database is modeled and held in the database with sign-ins and rollbacks, etc. It's non-trivial, and doesn't solve every problem, but given the lack of good VCS for databases it works.
  • Roll-outs (after testing, integration, etc) consist of just the new database structure going to the production site - the modeling tables are not replicated there.
Adam Davis
A: 

I suggest you have a look at neXtep designer. It allows you to put your database under version control in a dedicated VCS repository. You then work in this repository with checkouts, locks, commits, etc. The product allows you to automatically generate the SQL deltas from version differences and can generate delivery packages for you. It is an integrated development tool where you have data model diagrams, powerful SQL editors (hyperlinking, documentation hovers, in-text search, auto-completion, etc.), SQL clients, synchronization engines (to synch the repository to or from an existing database). The repository is designed for team work. You have different strategies : each developer can have its own workspace and development branch, in which case you would merge all developments to the main branch. Or you can make them all work on the same workspace as the product handles per-user locks when elements are modified.

The product is based on eclipse RCP and is provided freely under the GPL3 license. All information and concepts introduction are available in the wiki. Visit : http://www.nextep-softwares.com

And tell me what you think.

Christophe Fondacci