ansaurus

Question

Can Rails Migrations be used to convert data?

Answer 1

+11 A:

What you're trying to do is possible, and I would say the correct thing to do.

You need, though, to reload the column info for the model classes you're updating in the migration, so that Rails knows about the new columns. Try this:

def self.up
    add_column :users, :age_text, :string

    # Reload the cached column list so User knows about age_text
    User.reset_column_information

    users = User.find(:all)

    users.each do |u|
        u.age_text = convert_to_text(u.age)
        u.save
    end
end

On a separate note, if your table is large, updating rows one by one will take a long time. Be careful with that.

Eduardo Scoz 2009-05-11 21:52:49

What would be a better way?

Kirschstein 2009-05-11 22:59:59

It depends on what you're trying to do. If it's a simple update, you can run a single SQL command with the execute method (execute 'update user set colA = 1'). Again, just be careful: if you have 10k users in the Users table, looping through each of them with the migration above will likely take a long time.

Eduardo Scoz 2009-05-12 03:06:30
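As a rough sketch of the execute approach from the comment above: the snippet below sends one set-based UPDATE instead of saving each user from Ruby. FakeMigration is a hypothetical stand-in for ActiveRecord::Migration so the example runs without Rails, and the minor/adult rule is an invented placeholder for the question's convert_to_text, whose real logic is not shown in this thread.

```ruby
# Records every statement instead of talking to a database.
EXECUTED_SQL = []

# Hypothetical stand-in for ActiveRecord::Migration, so the sketch runs
# without Rails. A real migration would send the SQL to the database.
class FakeMigration
  def self.execute(sql)
    EXECUTED_SQL << sql
  end
end

class ConvertUserAges < FakeMigration
  def self.up
    # One statement updates every row inside the database.
    # The CASE rule is an assumed placeholder for convert_to_text.
    execute <<~SQL
      UPDATE users
      SET age_text = CASE WHEN age < 18 THEN 'minor' ELSE 'adult' END
    SQL
  end
end

ConvertUserAges.up
```

The trade-off is that the conversion has to be expressible in SQL, and model callbacks and validations are bypassed.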

Answer 2

+3 A:

Since I'm new here I can't comment on the above so I'll add my own answer.

GENERALLY manipulating data in migrations is a BAD idea. Migrations with direct model access can get stuck if the model logic changes.

Imagine in your second migration you've added a new column. You want to seed that column with new data.

Let's also say a few weeks later you add a new validation to the model - a validation that operates on a field that does not yet exist in your second migration. If you were ever to construct the database from migration 0, you'd have some problems.

I strongly suggest using migrations to alter columns and other means to manage database data, especially when moving to production.

Brian Hogan 2009-05-14 15:07:13

Hmm, very good point. With this in mind, how do you handle converting data?

Kirschstein 2009-05-14 16:32:05

I make a rake task. It's simpler, and easier to test too, even if I throw it away after I use it.

Brian Hogan 2009-05-14 16:41:40

Is there a way to ensure that your rake tasks are run in the correct sequence, along with migrations then? What if a migration made after this one drops the obsolete column before the rake task is run?

Kirschstein 2009-05-14 16:51:24

It's an unlikely situation. You generally have your rake tasks run after all your migrations are made, and you generally never re-run migrations on databases with actual data on them. Once you've 'fixed' your data, you don't 'need' to go back to 0 to fix that again. New developers, and new production boxes, are encouraged to create the database with rake db:schema:load instead, which is fast, and immediately builds the most recent db. Data conversions have, for me, been a 'one-off' thing. You convert the data and move forward.

Brian Hogan 2009-05-16 01:30:11

Ah I see. Thanks for your help

Kirschstein 2009-05-16 16:44:33

Thanks guys, this was really helpful. However, I have that 'unlikely situation' where an obsolete column is dropped after a migration like the one above. So I need to migrate data from that field before the dropping happens. Right now, I have that data migration in a migration instead of a rake task. What would you recommend in that case?

Daniel Pietzsch 2009-12-23 01:45:05

If you have to migrate data out of a column you're going to drop, you should write a separate script to do the work, that you can test. It shouldn't be part of the migration in my opinion. Also remember that rake tasks are chainable - you can make one rake task invoke others either directly or as dependencies.

Brian Hogan 2010-01-13 04:52:49
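To illustrate the point about chaining, here is a minimal sketch of two rake tasks where the cleanup step declares the data conversion as a dependency. The task names convert_ages and drop_old_column are hypothetical, and the task bodies only record the run order; real tasks would do the conversion and the column removal.

```ruby
require "rake"
extend Rake::DSL # makes `task` available outside a Rakefile

RUN_ORDER = []

# Hypothetical one-off task: copy data out of the column that is going away.
task :convert_ages do
  RUN_ORDER << :convert_ages
end

# Hypothetical cleanup task. Declaring convert_ages as a dependency
# means rake always runs the conversion first.
task :drop_old_column => :convert_ages do
  RUN_ORDER << :drop_old_column
end

# Same effect as running `rake drop_old_column` from the shell.
Rake::Task[:drop_old_column].invoke
```

Rake runs each task at most once per invocation, so several tasks can depend on the conversion without it running twice.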

Answer 3

A:

I would say that if you can "undo" the imported data when rolling back the migration version, then it's appropriate to put imports into the migration.

For example, I have a migration which sets up a lot of lookup tables and other meta-data. The data for these tables is populated during this phase. As the data for these lookup tables changes, I create new YAML files storing the meta-data and load those files in subsequent migrations (when backing out of a migration version, I undo that load by re-loading the previous YAML file). This is pretty clean. I end up with files like these (in different well-defined folders in my case):

002_setup_meta_data.rb
002_meta_data.yaml


007_change_meta_data.rb
007_meta_data.yaml
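A minimal sketch of that versioned YAML idea, not the answerer's actual code: up loads the newer snapshot and down re-loads the previous one. The YAML contents, the statuses key, and the in-memory LOOKUP_TABLES hash are all assumptions made so the example runs without Rails or real files.

```ruby
require "yaml"

# Hypothetical payloads, inlined so the sketch runs on its own. In the
# layout above they would live in 002_meta_data.yaml and 007_meta_data.yaml.
META_DATA_YAML = {
  "002" => "statuses:\n- active\n- inactive\n",
  "007" => "statuses:\n- active\n- inactive\n- archived\n"
}

# In-memory stand-in for the lookup tables.
LOOKUP_TABLES = {}

# Replace the lookup data wholesale with the snapshot for one version.
def load_meta_data(version)
  LOOKUP_TABLES.replace(YAML.safe_load(META_DATA_YAML.fetch(version)))
end

# Shaped like 007_change_meta_data.rb: up loads the new snapshot,
# down restores the one from the earlier migration.
class ChangeMetaData
  def self.up
    load_meta_data("007")
  end

  def self.down
    load_meta_data("002")
  end
end

load_meta_data("002") # what 002_setup_meta_data.rb would have done
ChangeMetaData.up
```

Because each snapshot replaces the data wholesale, rolling back needs no per-row bookkeeping, which is what makes this kind of import safe to undo.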

If you're importing "production" data from another system into transactional (non-static) tables, then I would say using migrations is not appropriate. Then I would follow Brian Hogan's advice of using rake tasks.

science 2009-05-20 17:57:32
