I have an existing Django web app that is in use. I have to radically migrate one key model in my design to a completely new design, but I want to keep all of the existing data for that model and migrate it to the new records in production when I am ready to deploy.

I can afford to bring my website down for a few hours one night and do whatever I need to do to migrate. What are some sane ways I can do this migration?

It seems any migration would need to:

1) Dump all of the existing data into some format, such as SQL, JSON, or XML
2) Migrate the model to the new format
3) Reload the data into the new model using a conversion script

I also thought of trying to store all of the existing data in some other model called "OldModel" (if Model is the name of the existing model) and then migrating the data live.

+4  A: 

There is a project to help with migrations that I've heard of: South.

Having said that, I admit we've not used it. We still plan our migrations using a file of SQL statements. Madness, I know, but it has the advantage of testability. You can run it as many times as necessary during development and staging testing before the "big deploy". It can be source controlled, diffed, etc. It can also, therefore, be called from a larger deployment script. Of course, we back up production before running it :-)

If your database does journaling, using the old-fashioned method has the added advantage that there is a transaction history that can be rolled back.

Experiments we've run with JSON, XML and "OldModel" -> "NewModel" style dumps have scaled pretty poorly. Mind you, YMMV... we have quite a large database. By using a script, you can run on your production database without having to offload or reload vast amounts of data. This way even a complicated migration can take seconds, rather than hours.
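To make the "script you can run as many times as necessary" idea concrete, here is a minimal sketch, assuming SQLite and Python's standard sqlite3 module rather than whatever database and tooling we actually use. The table names, column names, and the run_migration helper are all hypothetical. The point is only that the statements run inside one transaction, so a failure part-way through leaves the database as it was.

```python
import sqlite3

# Hypothetical migration: every table and column name here is illustrative.
MIGRATION = [
    "CREATE TABLE listing (id INTEGER PRIMARY KEY, price_cents INTEGER NOT NULL)",
    "INSERT INTO listing (id, price_cents) "
    "SELECT id, CAST(ROUND(price_dollars * 100) AS INTEGER) FROM listing_old",
    "DROP TABLE listing_old",
]


def run_migration(conn, statements):
    """Run all statements in one transaction; undo everything on failure."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")


def make_test_db():
    """Build a throwaway in-memory database shaped like the old schema."""
    # isolation_level=None stops the sqlite3 module from managing
    # transactions itself, so the explicit BEGIN/COMMIT above are in control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE listing_old (id INTEGER PRIMARY KEY, price_dollars REAL)"
    )
    conn.executemany(
        "INSERT INTO listing_old VALUES (?, ?)", [(1, 12.5), (2, 3.0)]
    )
    return conn


if __name__ == "__main__":
    conn = make_test_db()
    run_migration(conn, MIGRATION)
    print(conn.execute("SELECT id, price_cents FROM listing ORDER BY id").fetchall())
```

Because the database is rebuilt from scratch each time, the same migration can be rehearsed repeatedly in development and staging before it touches production.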

Jarret Hardie
If that approach is madness, then I'm crazy too!
Matthew Christensen
I'm using South for just such a migration as we speak. I'm migrating an already existing RealEstateListings app to a more general Listings app/model. South has been working fine so far, except for some issues migrating many-to-many fields.
Rasiel
+1  A: 

If you are more comfortable with the Django ORM than with raw SQL, you might consider using Model -> BackupModel -> TestModel -> Model, where all but the last step can be performed without dropping data.

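def backup(InModel, OutModel):
    """Copy every InModel row into OutModel via OutModel.convert_from."""
    # iterator() avoids loading the whole table into memory at once.
    for obj in InModel.objects.iterator():
        out_obj = OutModel.convert_from(InModel, obj)
        out_obj.save()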

Here, you would just make sure that all your models have convert_from methods implemented. These should all be trivial conversions except for BackupModel -> TestModel. In the other cases, nothing but the class would change, all data being identically preserved.
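As a rough illustration of what those convert_from methods might look like, here is a sketch that uses plain dataclasses as stand-ins for Django models so it runs without a database. The field names (full_name, first_name, last_name) are made up, and backup here takes a list of objects instead of querying InModel.objects, so treat it as the shape of the idea and not as working ORM code.

```python
from dataclasses import dataclass, fields


@dataclass
class Model:
    """Stand-in for the existing model (hypothetical fields)."""
    id: int
    full_name: str


@dataclass
class BackupModel:
    """Same fields as Model, so the conversion is trivial."""
    id: int
    full_name: str

    @classmethod
    def convert_from(cls, in_model, obj):
        # Copy every field of the source class across unchanged.
        return cls(**{f.name: getattr(obj, f.name) for f in fields(in_model)})


@dataclass
class TestModel:
    """Stand-in for the new design; this is the only non-trivial conversion."""
    id: int
    first_name: str
    last_name: str

    @classmethod
    def convert_from(cls, in_model, obj):
        # Split the old single name field into the two new fields.
        first, _, last = obj.full_name.partition(" ")
        return cls(id=obj.id, first_name=first, last_name=last)


def backup(in_model, in_objs, out_model):
    """Convert each source object; a real version would also save() it."""
    return [out_model.convert_from(in_model, obj) for obj in in_objs]


if __name__ == "__main__":
    originals = [Model(1, "Ada Lovelace")]
    backups = backup(Model, originals, BackupModel)
    print(backup(BackupModel, backups, TestModel))
```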

The advantage to this is that before you go rewriting all your interfaces, you can play around with TestModel and make sure that your conversions were what you thought they'd be. If everything goes wrong, you convert from BackupModel->Model, and everything is okay. In a worst-case scenario, you give up on Django's ORM, run back to SQL, and simply rename all your tables that begin with backupmodel__* to model__* in your database.

Disclaimer: I've never done this.

David Berger
+1  A: 

There are around 5 or 6 tools to help automate some portion of migrations. Several of them are listed in this question and I'll add the others just for completeness.

Next, see S. Lott's answer to this question about migration workflows for a great idea on using version numbers in the model name to make migrations easier, including structuring a standalone script to properly convert the tables. To my mind this is vastly superior to serializing the data for export and then trying to build your new tables by importing.

Finally, I haven't been able to think of a way to do a hot migration properly and haven't seen any hints from anywhere else either, so maintenance downtime is inevitable.

Van Gale
+1  A: 

Make all migrations in steps!

If you need to add a field, go ahead and add it, either with a default value or as optional. This is safe. If you need to make an existing optional field required, give it a default first. If you need to remove the default from an existing field, drop it only after fixing all the code that creates instances.

If you need to change the type of a field, first add a new field that takes its value from the current one. Second, run a script to update the existing instances to populate the new field. Third, change all the code that uses the old field to use the new one. Finally, when no code is left using the original, you can drop it.
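Here is a small sketch of the field type change done in steps, assuming SQLite and Python's standard sqlite3 module so it runs anywhere. The listing table and the price and price_int columns are hypothetical. In a real Django project you would make the same changes on the model and run the backfill against your own tables.

```python
import sqlite3


def make_db():
    """Throwaway database where price was (unwisely) stored as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listing (id INTEGER PRIMARY KEY, price TEXT)")
    conn.executemany(
        "INSERT INTO listing (id, price) VALUES (?, ?)", [(1, "12"), (2, "7")]
    )
    conn.commit()
    return conn


def step1_add_field(conn):
    # The new column is optional (NULL allowed), so existing code keeps working.
    conn.execute("ALTER TABLE listing ADD COLUMN price_int INTEGER")
    conn.commit()


def step2_backfill(conn):
    # Only touches rows that are not yet populated, so it is safe to re-run.
    conn.execute(
        "UPDATE listing SET price_int = CAST(price AS INTEGER) "
        "WHERE price_int IS NULL"
    )
    conn.commit()


def read_prices(conn):
    # Step 3: application code now reads the new column instead of the old one.
    return conn.execute(
        "SELECT id, price_int FROM listing ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    step1_add_field(conn)
    step2_backfill(conn)
    print(read_prices(conn))
```

The last step, dropping the old price column, is left out on purpose: it only happens once nothing reads or writes that column any more.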

For every situation there is a small step you can make. For every bigger change, you can break it down into little ones. This is one place iterative development pays off. Keep good backups in place and don't be afraid to push often! Make the small changes quickly to see if they work.

ironfroggy
This is even easier with Django Evolution.
David Berger