views:

462

answers:

6

As I move through the iterations on my application*(s) I accumulate migrations. As of just now there are 48 such files, spanning about 24 months' activity.

I'm considering taking my current schema.rb and making that the baseline.

I'm also considering deleting (subject to source control, of course) the existing migrations and creating a nice shiny new single migration from my my current schema? Migrations tend to like symbols, but rake db:schema:dump uses strings: should I care?

Does that seem sensible? If so, at what sort of interval would such an exercise make sense? If not, why not?

And am I missing some (rake?) task that would do this for me?

* In my case, all apps are Rails-based, but anything that uses ActiveRecord migrations would seem to fit the question.

+2  A: 

I think that there are two kinds of migrations:

  • those you made during design/development, because you changed your mind on how your db should be like;

  • those you made between releases, reflecting some behaviour changes.

I get rid of the first kind of migrations as soon as I can, as they do not really represent working releases, and keep the second kind, so that it is possible, in theory, to update the app.

About symbols vs strings: many argue that only strings should be used in migrations: symbols are meant to be "handles" to objects, and should not be used to represent names (column and table names, in this case). This is a mere stylistic consideration, but convinced me, and I'm no more using symbols in migrations.

I've read of another point for using strings: "ruby symbols are memory leaks", meaning that, when you create a symbol, it never gets disposed for all the application life time. This seems quite pointless to me, as all your db columns will be used as symbols in a Rails (and ActiveRecord) app; the migrating task, also, will not last forever, so I don't think that this point actually makes sense.

giorgian
Just a random note regarding symbols/strings:If a symbol is used 10 times, it only takes up the memory once. When a string is used (assuming a literal string) it takes up the memory required every time it's in the code. So the symbol might never get GC'd, there's only ever one of it around.
Daniel Huckstep
A: 

You shouldn't be deleting migrations. Why create the extra work?

Migrations essentially are a set of instructions that define how to build the database to support your application. As you build your application the migrations record the iterative changes you make to the database.

IMHO by resetting the baseline periodically you are making changes that have the potential to introduce bugs/issues with your application, creating extra work.

In the case where a column is mistakenly added and then needs to be removed sometime later, just create a new migration to remove extra column. My main reason for this is that when working in a team you don't want your colleagues to have to keep rebuilding their databases from scratch. With this simple approach you (and they) can carry on working in an iterative manner.

As an aside - when building a new database from scratch (without any data) migrations tend to run very quickly. A project I am currently working on has 177 migrations, this causes no problems when building a new database.

dmcnally
+1  A: 

Although I'm sure everyone has their own practices, there's a few rules implied by the way the migration system works:

  • Never commit changes to migrations that may have been used by other developers or previous deployments. Instead, make an additional migration to adjust things as required.
  • Never put model-level dependencies in a migration. The model may be renamed or deleted at some point in the future and this would prevent the migration. Keep the migration as self-contained as possible, even if that means it's quite simplistic and low-level.

Of course there are exceptions. For example, if a migration doesn't work, for whatever reason, a patch may be required to bring it up to date. Even then, though, the nature of the changes effected by the migration shouldn't change, though the implementation of them may.

Any mature Rails project will likely have around 200 to 1000 migrations. In my experience it is unusual to see a project with less than 30 except in the planning stages. Each model, after all, typically needs its own migration file.

Collapsing multiple migrations into a single one is a bad habit to get into when working on an evolving piece of software. You probably don't collapse your source control history, so why worry about database schema history?

The only occasion I can see it as being reasonably practical is if you're forking an old project to create a new version or spin-off and don't want to have to carry forward with an extraordinary number of migrations.

tadman
+4  A: 

Yes, this makes sense. There is a practice of consolidating migrations. To do this, simply copy the current schema into a migration, and delete all the earlier migrations. Then you have fewer files to manage, and the tests can run faster. You need to be careful doing this, especially if you have migrations running automatically on production. I generally replace a migration that I know everyone has run with the new schema one.

Other people have slightly different ways to do this.

I generally haven't done this until we had over 100 migrations, but we can hit this after a few months of development. As the project matures, though, migrations come less and less often, so you may not have to do it again.

This does go against a best practice: Once you check in a migration to source control, don't alter it. I make a rare exception if there is a bug in one, but this is quite rare (1 in 100 maybe). The reason is that once they are out in the wild, some people may have run them. They are recorded as being completed in the db. If you change them and check in a new version, other people will not get the benefit of the change. You can ask people to roll back certain changes, and re-run them, but that defeats the purpose of the automation. Done often, it becomes a mess. It's better left alone.

ndp
So, for example, once I've run a migration into production, why would I want to keep running the old migrations? To set up a new DB, for a new dev/test environment, say, I should just db:schema:load to the baseline and then run new migrations. That seems sensible. I'll always have the original migrations in older source versions, after all.
Mike Woodhouse
Our automated test environment does a db:reset and runs all migrations before each set of tests, and that's really the only time the migrations are run through fully, where redoing migrations is that helpful. People clean them up just to reduce the number of files. Certainly the db:schema:load would work... I think the idea behind it is that you get some testing of the migrations as part of the continuous integration.
ndp
+1  A: 

The top of schema.rb declares:

# This file is auto-generated from the current state of the database. Instead of editing this file, 
# please use the migrations feature of Active Record to incrementally modify your database, and
# then regenerate this schema definition.
#
# Note that this schema.rb definition is the authoritative source for your database schema. If you need
# to create the application database on another system, you should be using db:schema:load, not running
# all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
#
# It's strongly recommended to check this file into your version control system.

I must endorse what [giorgian] said above about different migrations for different purposes. I recommend cleaning up development-oriented migrations along with other tasks you do when you branch for a release. That works for well for me, for myself and small teams. Of course my main app sits atop and between two other databases with their own schemas which I have to be careful of so we use migrations (rather than schema restore) for a new install and those need to survive release engineering.

adric
+1  A: 

Having lots of migrations are a good thing. Combined with your version control system, they allow you to see what developer made a change to the database and why. This helps with accountability. Removing them just makes this a big hassle.

If you really want to get a new database up and running quickly you can just load the schema with rake db:schema:load RAILS_ENV=your_environment and if you want to get your test database setup quick you can just use rake db:test:prepare

That being said, if you really want to consolidate your migrations then I'd create a new migration that checks to see if the very last migration in your set has been performed (ex: does the column you added exist?) and if not, then it will fire. Otherwise the migration will just add itself to the schema table as completed so it doesn't attempt to fire again.

Just communicate what you're doing to the rest of your team so that they understand what is going on lest they blindly fire off a rake db:migrate and screw up something they already had.

PatrickTulskie