views: 343
answers: 2
We are just giving MongoDB a test run and have set up a Rails 3 app with Mongoid. What are the best practices for inserting large datasets into MongoDB? To flesh out the scenario: say I have a book model and want to import several million records from a CSV file.

I suppose this needs to be done in the console, so this may not be a Ruby-specific question.

Edited to add: I assume it makes a huge difference whether the imported data includes associations or goes into a single model only. Comments on either scenario are welcome.

A: 

If you only need to load this dataset once, you can use the db/seeds.rb file: read your CSV there and generate all the documents, for example as sketched below.
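
A minimal sketch of what db/seeds.rb could look like, assuming a Book Mongoid model and a books.csv file with title and author columns (the file name and columns are made up for illustration):

    # db/seeds.rb -- hypothetical example; Book, books.csv and its columns are assumptions
    require 'csv'

    CSV.foreach(Rails.root.join('db', 'books.csv'), :headers => true) do |row|
      Book.create!(:title => row['title'], :author => row['author'])
    end

You would then run it with rake db:seed.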

If you need to do this more than once, you can write a rake task or a runner.

For a task, define a file under lib/tasks (for example lib/tasks/import.rake), parse your CSV in it, and generate all the documents there.
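
A rough sketch of such a task; the task name, file path, model and CSV columns are illustrative, not taken from the answer:

    # lib/tasks/import.rake -- hypothetical task; Book and the CSV layout are assumptions
    require 'csv'

    namespace :import do
      desc "Import books from a CSV file"
      task :books => :environment do
        CSV.foreach(Rails.root.join('db', 'books.csv'), :headers => true) do |row|
          Book.create!(:title => row['title'], :author => row['author'])
        end
      end
    end

Invoke it with rake import:books.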

You can do the same thing with a rails runner script.

It's the same approach you would use with ActiveRecord.

shingara
+1  A: 

MongoDB comes with import/export tools (mongoimport and mongoexport) that parse JSON-formatted data.

Assuming you have an existing database in SQL, the easiest way to migrate that data is to output your SQL data as JSON strings, then use the import tool for each collection.

This includes denormalization and nesting/embedding - so rather than migrating your relational model to MongoDB as-is, you should also consider refactoring your data model to leverage MongoDB's features.

For example, a common task is to merge articles and tags into an articles collection, with the tags embedded as an array. Do that in your export script, so all MongoDB sees is nice clean JSON coming in through the import :-)
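
As a rough illustration of such an export script (the Article and Tag models, the articles.json file name, and the mongoimport invocation in the comment are assumptions, not details from the answer):

    # export_articles.rb -- hypothetical script run against the existing SQL-backed app
    require 'json'

    File.open('articles.json', 'w') do |file|
      Article.includes(:tags).find_each do |article|
        # Embed the tag names directly in each article document
        doc = {
          :title => article.title,
          :body  => article.body,
          :tags  => article.tags.map(&:name)
        }
        file.puts doc.to_json   # one JSON document per line
      end
    end

    # Then load the file into MongoDB with something like:
    #   mongoimport -d myapp_production -c articles --file articles.json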

You can still import all your tables as collections, but you're missing out on some of the true strengths of MongoDB by doing that.

spacemonkey
Having just embarked on this, am still wrapping my head around MongoDB. Will probably have more specific questions in the near future :-) Thanks.
hakanensari