views:

116

answers:

1

We have a Rails project that uses PostgreSQL-specific queries, so it's uncomfortable to use sqlite even in development mode. The problem is, I'd love to track schema changes in database and omit running migrations on request, and I'd also love to track db-data changes with git, so that I wouldn't need to dump the db and load it on my machine. So basically I only want to do 'git pull' and see the application working with the new schema and data.

What are the possible strategies here? The only that comes to my mind is to use a simple wrapper that takes an sql-query, checks if it has any db-specific parts and rewrites it for development environment, so that we could still use sqlite. What else?

+1  A: 

I'm not sure I understand all the nuances of your question - particularly the comments about using SQLite vs PostgreSQL. If it is to be a multi-DBMS system, then testing with multiple systems is good; if it is to be a single-DBMS system, then working with multiple DBMS is making life pointlessly hard.

Also, you talk about tracking the schema changes in the database...is this storing the information about schema changes separately from DBMS's own system catalog, or do you really mean that you want to track database schema changes (using something outside the database - such as a VCS)?

You also talk about tracking 'DB-data changes' which I take to mean 'the data in the tables in the database'. Again, I'm not clear if you are thinking of some sort of dump of the data from the database that covers the differences between what was there, say, a day ago and what is there now, or something else.

These issues might be why you didn't get a response for over 4 hours.

When you talk about a 'simple wrapper', you are not talking about something that I'd call simple. It has to parse arbitrary SQL, work out whether any of it is DBMS-specific, and then apply rewrite rules. That is a non-trivial undertaking. Getting the wrapper called in the right places could be non-trivial too - it depends on the set of APIs you are using to access the DBMS, amongst other things.

What else?

  • Use the same DBMS in both production and development?
  • Tracking just schema changes is non-trivial. You need to track the essence of the schema (such as table name, column names, etc) and not the accidence (yeah, I was rereading Brooks' "No Silver Bullet" earlier) such as the TabID (which might vary without the schema being materially different). However, an analysis would tell you whether the schema is different.
  • Tracking the data changes, independent of schema changes, is also non-trivial. In general, the volume of such data is large. You may be able to deal with a full archive or a full unload or export of the database - but ensuring that the data is presented in the same sequence each time may require some care on your part. If you don't ensure the correct sequencing, the VCS will be recording huge changes due to ordering differences.

All the above amounts to the dreaded "it depends" answer. It depends on:

  • Your DBMS
  • Your database size
  • The volatility of your schema
  • The volatility of your data

It only marginally depends on your VCS or platform, fortunately.

Jonathan Leffler