I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a rake task, db:seed, that seeds the database from them.

namespace :db do
  desc "Load seed fixtures (from db/fixtures) into the current environment's database." 
  task :seed => :environment do
    require 'active_record/fixtures'

    Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
      Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
    end
  end
end

I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?

Thanks!

+1  A: 

Rails 3 will solve this for you with a db/seeds.rb file.

http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb

Jarrod
Will that be restricted to loading only when the database is empty? If not, why wouldn't I still be able to accidentally seed twice and destroy live data? Very happy this is finally becoming a convention!
Tony
You'll populate it with Ruby code, just as if you had written the seed data into your migrations. My guess is that it will simply execute whatever code you put into db/seeds.rb, so it won't wipe the database but add to it (or update it, depending on your code).
Jarrod
Hopefully it has an "ignore duplicate entries" option so that I can seed twice in "append" mode without creating duplicate data. And since I'd be in "append" mode, there would be no worry about wiping the db clean.
Tony
+1  A: 

How about just deleting the task off your production server once you have seeded the database?

srboisvert
Yeah, that is a good call, but then I have to keep it out of the repo or the file comes back every time I deploy. Or I could make Capistrano delete the file after it deploys. I think that is better.
Tony
+6  A: 

Seeding data with fixtures is an extremely bad idea.

Fixtures are not validated, and since most Rails developers don't use database constraints, you can easily get invalid or incomplete data inserted into your production database.

Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.

There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.

I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
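The helper from the article is not reproduced here, so what follows is only a minimal sketch of the idea, assuming records are matched on an explicit id. `Country` is a hypothetical in-memory stand-in for an ActiveRecord model, used so the example runs without Rails; in a real app the lookup and save would go through ActiveRecord instead.

```ruby
# Hypothetical stand-in for an ActiveRecord model; it keeps rows in a Hash
# so the sketch runs without Rails or a database.
class Country
  attr_accessor :id, :name, :code

  @rows = {}

  class << self
    attr_reader :rows

    # Assumed behaviour of a create_or_update helper: look the record up
    # by id, create it if missing, then overwrite the given attributes.
    def create_or_update(attrs)
      record = (@rows[attrs.fetch(:id)] ||= new)
      attrs.each { |key, value| record.public_send("#{key}=", value) }
      record
    end
  end
end

# A seed script written this way can be run repeatedly.
def seed_countries
  Country.create_or_update(:id => 1, :name => 'United States', :code => 'US')
  Country.create_or_update(:id => 2, :name => 'Canada', :code => 'CA')
end

seed_countries
seed_countries # second run updates the same two rows instead of adding more
```

Because every call names the id it targets, running the script a second time touches the same rows rather than inserting new ones, which is the property that makes an accidental re-seed harmless.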

I wrote an article about these techniques a while back called Loading seed data.

Luke Francl
I definitely like doing it through ActiveRecord, but I thought you couldn't set the primary keys that way. I'll have to give the method from your article a shot.
Tony
What if you are seeding data that does not have a model associated with it?
Tony
If it doesn't have a model, does it mean you're not storing the data in the database? In that case I'd use constants. You could set those in an initializer.
Luke Francl
+2  A: 

For the first part of your question: yes, I'd add a safeguard before running a task like this in production. I put a guard like this in my bootstrapping/seeding task:

task :exit_or_continue_in_production? do
  if Rails.env.production?
    puts "!!!WARNING!!! This task will DESTROY " +
         "your production database and RESET all " +
         "application settings"
    puts "Continue? y/n"
    continue = STDIN.gets.chomp
    unless continue == 'y'
      puts "Exiting..."
      exit! 
    end
  end
end
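One way to reuse a guard like this is to pull the prompt into a plain method that the rake task calls, which also lets you exercise it without a terminal. The sketch below is an assumption about how that could look; `confirm_destructive?` and its parameters are hypothetical names, not part of Rails or Rake.

```ruby
require 'stringio'

# Hypothetical helper: returns true when it is safe to continue.
# Outside production it never prompts; in production it requires an explicit "y".
def confirm_destructive?(env, input = $stdin, output = $stdout)
  return true unless env.to_s == 'production'

  output.puts '!!!WARNING!!! This task will DESTROY your production database'
  output.puts 'Continue? y/n'
  answer = input.gets.to_s.chomp # to_s covers a closed or empty input stream
  answer == 'y'
end

# Simulated runs, using StringIO in place of a real terminal.
confirm_destructive?('development')                                   # no prompt
confirm_destructive?('production', StringIO.new("n\n"), StringIO.new) # declined
```

Inside the task body this could be called as `exit! unless confirm_destructive?(Rails.env)`, and the guard task can be listed as a prerequisite of the seeding task so it always runs first.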

I have created this gist for some context.

For the second part of the question: usually you want two things: a) an easy way to seed the database and set up the application for development, and b) a way to bootstrap the application on the production server (inserting an admin user, creating folders the application depends on, etc.).

I'd use fixtures for seeding in development: everyone on the team then sees the same data in the app, and what's in the app is consistent with what's in the tests. (Usually I wrap rake app:bootstrap, rake app:seed, rake gems:install, etc. into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)

However, I'd never use fixtures for seeding/bootstrapping on the production server. Rails' db/seeds.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, as others pointed out.

karmi
A: 

I just had an interesting idea...

What if you created db/seeds/ and added migration-style files:

file: 200907301234_add_us_states.rb

class AddUsStates < ActiveRecord::Seeds

  def up
    add_to(:states, [
      {:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
      {:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
    ])
  end

  def down
    remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
  end
end

alternately:

  def up
    State.create!( :name => ... )
  end

This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.

thoughts?