I need to populate particular tables in my production database with data before anyone ever touches the application. This data is also needed in development mode, because I test against it. Fixtures are normally the way to go for test data, but what's the "best practice" in Ruby on Rails for shipping this data to the live database when the db is created?

Ultimately this is a two-part question, I suppose.

1) What's the best way to load test data into my database for development? This will be roughly 1,000 items. Is it through a migration or through fixtures? The answer here may differ from the answer to the question below because, in development, there are certain fields in the tables that I'd like to make random. In production, these fields would all start with the same value of 0.

2) What's the best way to bootstrap a production db with the live data I need in it? Is this also through a migration or fixtures?

I think the answer is to seed as described here: http://lptf.blogspot.com/2009/09/seed-data-in-rails-234.html but I need a way to seed for development and a way to seed for production. Also, why bother using fixtures if seeding is available? When does one seed and when does one use fixtures?

+3  A: 

Usually fixtures are used to provide your tests with data, not to populate data into your database. You can - and some people have, like the link you point to - use fixtures for this purpose.

Fixtures are OK, but using Ruby gives us some advantages: for example, being able to read from a CSV file and populate records based on that data set. Or you can read from a YAML fixture file if you really want to: since you're starting with a programming language, your options are wide open from there.
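As a rough sketch of the "read a data file from Ruby" idea, here is one way a fixture-shaped YAML document could be turned into attribute hashes. The YAML content, the field names, and the `seed_rows` helper are all made up for illustration; in a real app the text would come from a file and each hash would be passed to a model.

```ruby
require "yaml"

# Hypothetical fixture-style YAML. In a real app this would be read from a
# file rather than embedded in the script.
ITEMS_YAML = <<~YAML
  item_001:
    name: Widget
    score: 0
  item_002:
    name: Gadget
    score: 0
YAML

# Turn the label => attributes mapping into a plain list of attribute hashes.
def seed_rows(yaml_text)
  YAML.safe_load(yaml_text).values
end

rows = seed_rows(ITEMS_YAML)
rows.each { |attrs| puts attrs.inspect }
# In a seed script, each hash could then be handed to a model's create method.
```

The labels (`item_001`, ...) are dropped here because only the attributes matter when creating records.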

My current team tried using db/seeds.rb and checking RAILS_ENV to load only certain data in certain environments.
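A minimal sketch of that kind of environment check, applied to the question's requirement (random values in development, 0 in production), might look like this. The `item_attributes` helper and the `name`/`score` fields are hypothetical, and `env` simply stands in for RAILS_ENV so the example runs outside Rails.

```ruby
# Build the attributes for one seeded item.
# Hypothetical fields: name, score. `env` stands in for RAILS_ENV.
def item_attributes(index, env, rng = Random.new)
  # Production rows all start at 0; every other environment gets a random value.
  score = env == "production" ? 0 : rng.rand(0..100)
  { name: "Item #{index}", score: score }
end

env = ENV["RAILS_ENV"] || "development"
items = (1..1000).map { |i| item_attributes(i, env) }
puts "Prepared #{items.size} items for #{env}"
```

Keeping the environment decision inside one small method means the same seed script can be used everywhere, with only the generated values differing.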

The annoying thing about db:seed is that it's meant to be a one-shot thing: if you have additional items to add in the middle of development, or after your app has hit production, you need to take that into consideration (ActiveRecord's find_or_create_by...() method might be your friend here).
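To illustrate why a find-or-create approach helps, here is a small sketch that can be run repeatedly without duplicating rows. `FakeRole` is a stand-in written for this example so it runs without Rails; it is not ActiveRecord, and its `find_or_create_by_name` only imitates the idea of the dynamic finder.

```ruby
# Stand-in for an ActiveRecord model so the sketch runs without Rails.
class FakeRole
  @rows = []

  class << self
    attr_reader :rows

    # Imitates the find-or-create idea: return the existing row with this
    # name, or create and store it if it is missing.
    def find_or_create_by_name(name)
      rows.find { |r| r[:name] == name } || { name: name }.tap { |r| rows << r }
    end
  end
end

# Hypothetical seed routine with made-up role names.
def seed_roles
  %w[admin editor viewer].each { |name| FakeRole.find_or_create_by_name(name) }
end

seed_roles
seed_roles # running the seed a second time adds nothing
puts FakeRole.rows.size
```

With plain create calls the second run would have doubled the rows, which is exactly the problem when a seed file grows after the first deploy.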

We tried the Bootstrapper plugin, which puts a nice DSL over the RAILS_ENV checking and lets you run only the data for the environment you want. It's pretty nice.

Our needs actually went beyond that - we found we needed database-style migrations for our seed data. Right now we are putting normal Ruby scripts into a folder (db/bootstrapdata/) and using Arild Shirazi's required gem to load (and thus run) the scripts in this directory.

Now, this only gives you part of what database-style migrations offer. It's not hard to go from this to something where each data migration can only be run once (like database migrations).
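One possible shape for the "run each data script only once" idea is sketched below. This is not what our team actually built; the `DataMigrator` class and the script names are invented for illustration, and the in-memory set would need to be persisted (for example in a tracking table) in a real app.

```ruby
require "set"

# Runs each named data script at most once. `applied` plays the role of a
# tracking table; here it only lives in memory (an assumption of this sketch).
class DataMigrator
  attr_reader :applied

  def initialize(applied = Set.new)
    @applied = applied
  end

  # scripts: hash of name => callable. Pending scripts run in name order.
  # Returns the names of the scripts that were run.
  def run(scripts)
    pending = scripts.keys.sort.reject { |name| applied.include?(name) }
    pending.each do |name|
      scripts[name].call
      applied << name
    end
    pending
  end
end

log = []
scripts = {
  "001_add_roles"     => -> { log << "roles" },
  "002_add_utilities" => -> { log << "utilities" }
}

migrator = DataMigrator.new
migrator.run(scripts)                               # runs both scripts
scripts["003_add_filters"] = -> { log << "filters" }
migrator.run(scripts)                               # runs only the new script
puts log.inspect
```

Prefixing script names with a sortable number keeps the run order predictable, in the same way migration file names do.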

Your needs might stop at Bootstrapper: we have pretty unique needs (developing the system when we only know half the spec, a largish Rails team, a big data migration from the previous generation of software). Your needs might be simpler.

RyanWilcox
A: 

If you do want to use fixtures, their advantage over seeding is that you can also export them easily.

A quick guess at how the rake tasks might look is as follows:

  desc 'Export data from an existing database to fixtures. ' \
       'Uses the database for the current RAILS_ENV (development by default).'
  task :export => :environment do
    sql = "SELECT * FROM %s"
    export_tables = [
      "roles",
      "roles_users",
      "roles_utilities",
      "user_filters",
      "users",
      "utilities"
    ]

    # Each export goes into its own timestamped folder.
    time_now = Time.now.strftime("%Y_%h_%d_%H%M")
    folder = "#{RAILS_ROOT}/db/fixtures/#{time_now}"
    FileUtils.mkdir_p folder
    puts "Exporting data to #{folder}"

    # The :environment task has already connected to the RAILS_ENV database,
    # so no explicit establish_connection call is needed here.
    export_tables.each do |table_name|
      i = "000" # per-table counter; succ! yields "001", "002", ...
      File.open("#{folder}/#{table_name}.yml", 'w') do |file|
        data = ActiveRecord::Base.connection.select_all(sql % table_name)
        # Key each row by a unique label such as "users_001".
        fixtures = data.inject({}) do |hash, record|
          hash["#{table_name}_#{i.succ!}"] = record
          hash
        end
        file.write fixtures.to_yaml
      end
    end
  end

  desc 'Import the models that have YAML files in ' \
       'db/fixtures/default or from a specified path.'
  # Depends on :environment because blank? comes from ActiveSupport.
  task :import => :environment do
    location = 'db/fixtures/default'
    puts ""
    puts "enter import path [#{location}]"
    # Pressing Enter without typing anything keeps the default location.
    location_in = STDIN.gets.chomp
    location = location_in unless location_in.blank?
    ENV['FIXTURES_PATH'] = location
    puts "Importing data from #{ENV['FIXTURES_PATH']}"
    Rake::Task["db:fixtures:load"].invoke
  end
Will