I work on a project that uses a reasonably large database, the live version weighing in at somewhere around 60-80GB. The live database is the only real definitive source of our schema, and because of its size duplicating this database is too slow to be done often. This means we have ended up developing our database schema in a pretty ad hoc way, using sql compare to migrate changes from dev dbs to the live system, and only wiping our dev dbs every month or two. I am hoping to get some pointers on how to improve our database development work flow so that we have a little more control.
Some things to think about:
- Currently nobody is really in charge of the database schema, all developers can change it if they need to, though generally these decisions are talked about before they are done.
- There are stored procedures, functions, and views in the database. These should probably be dumped to files so they can be reloaded on every build.
- Schema changes should probably be checked in as scripts. We have started to do this recently. However all our scripts must then be numbered (because there may be dependencies between them), and must be re runnable (because our build script currently runs them all in order). This makes them hard to read because they are full of conditionals that check whether tables or columns already exist. This is a step that is often forgotten by developers.
- Getting a new database should be quick and easy. This is currently a big problem, it takes several hours to get a copy of last nights backup and restore it onto a dev machine.
- Some mechanism needs to be in place to allow developers to update static data. We have tables that contain data that is never updated through the application, but does potentially need to be changed when we do a new release (often this drives dropdowns).
- The whole thing needs to be runnable as part of a build script.
Are there any tools that can be used to help to do this? Eventually I would like to be at a point where a new DB can be built from scratch without copying any data from the live system. I don't mind writing some scripts to glue all the steps together but each part should be easily editable so that we continue to use it rather than make changes directly on DBs.