views:

156

answers:

4

I created two applications that were essentially identical on heroku. They started off different because I was testing uploading to heroku and having some challenges making the adjustments.

But now things seem to be working, but both have data that I would like to consolidate. Since they run off the same git repository, the code is the same, as are the migrations.

Seems like I need to bring it down locally and merge, but not exactly clear how to do that. Did some searches on Google and nothing clear.

I'd like some help in terms of a step-by-step, I don't have a clear process.

1) I have two apps on heroku where I have the databases. They have the same schemas;

2) I don't need to know where the data came from: I just need it all to reside in a single database

3) I would like to be able to do it with specific sql commands, versus manually opening (not sure how I would do that) and then munging since there are about 10 different interrelated tables.

Thanks!

+1  A: 

There is not automatic way to do this since there is no way to automate this in a generic fashion (without doing some stuff you would want to do). Therefore, it'll take a few steps, but you can leverage tools all along the way.

You can use Heroku's built-in tools to get a dump of the table. First download and import the data into your database, and then dump it out into a text file (SQL format).

Once you have one of the data sets in SQL as text, you need to edit the file a little. You need to make it an import script instead of a "rebuild the database" script that starts by deleting existing rows (or tables). If you're careful, it may already be in the right format, but likely something will be off.

There are a few gotchas you can run into:

  • If you have generated keys for records-- which you probably do-- then you'll have to renumber them in the data set you are importing. There may be a way to export them without generated keys, but what I have done is use a quick grep to renumber them outside of the range of the database I'm merging into.
  • If there are references to theses keys in other tables (as foreign keys), you'll have to renumber there as well.
  • Some tables may be "reference tables", and the same on both systems, so you can skip importing them.
  • Some tables may not need to be merged.

Once you have the text file in good shape, run it locally and test it. If it messes things up, don't worry-- just download the production data (the one you're importing into), and try again. Iterate until you have everything working well locally. Then, upload the file to heroku.

I know it sounds like a few steps-- and it is. There are no tricky problems to solve, though. You just need to go slowly and carefully. Get someone to pair with you on it to help you think it through.

ndp
thanks...so the local database in sqlite3....which I'm not sure how to open, it looks like just gibberish. So I'm not sure exactly how to do this merge -- no tools exist?
Angela
You can runsqlite3 /path/to/databaseto be able to execute commands on it.
David Dollar
I see... I guess I'm looking for the specific commands to actually merge the files together, including some of the things like changing the keys referenced? Or can it be opened in a simple flat file?
Angela
You can probably export each of the sqlite3 tables into CSV or similar and merge that, good luck
Tom
How do I do that?
Angela
Instead of importing the data into you local DB, create a bundle from your heroku app and download the bundle. Inside the bundle (its a TAR-file), you will find the SQL-Dump. See here: http://docs.heroku.com/backups
Sven Koschnicke
+1  A: 

Assuming you don't need to eliminate duplicates, you can do this for each table

insert into db1.tablea
select * from db2.tablea ;

Some complications:

  • if the tables have id columns, you need to make sure they don't clash, by replacing old ids with new ids
  • but, since the ids are the keys that link the tables, you need to make sure that new ids match in each table.

Here's a quick and dirty way to do it:

  • Find the highest id in any table in the first database.
  • Call this max_key_db1.
  • Then update all keys in the second database to be current_value plus max_key_db1.

Note that you'll need to update both primary keys and foreign keys for this to work, e.g.:

update db2.tablea set id = id + max_key_db1, foreign_id = foreign_id + max_key_db1;
update db2.tableb set id = id + max_key_db1, a_id = a_id + max_key_db1;
etc.

Now you have a self-consistent db2 with all keys (primary and foreign) with values that don't exist in db1; in other words, you keys are unique across both databases.

Now you can insert the rows from db2 into db1:

insert into db1.tablea
select * from db2.tablea ;

Note this won't work if the tables inserted into create their own ids using auto-increment or triggers; in this case you'll have to specify the ciolumns explicitly and turn off any auto-generated ids:

insert into db1.tablea( id, foreign_id, col1, ...)
select id, foreign_id, col1 from db2.tablea ;

Alternately, you can leave db2 unaltered, by doing this all in one step for each table:

insert into db1.tablea( id, foreign_id, col3, col4)
select id + max_key_db1, foreign_id + max_key_db1, col3, col4 from db2.tablea ;

Of course, do this all inside a transaction, and don't commit until you're sure you've gotten every table and all are correct. And do this on copies of your databases.

Now, since you used the highest key in db1 regardless of table, likely your ids won't be consecutive, but who cares? Keys are keys. What you will need to do is reset any auto_increment or sequence, for each table, so that teh next auto-generated key is higher than the highest key in that table. How you do that depends on what RDBMS you're using.

tpdi
Sory for asking a noob question, but sqlite3 look like a flat file, where do I actually issue these commands?
Angela
Run (from the command line) sqlite on the database: sqlite3 database-name
tpdi
So is this a step by step....how would you suggest I do the copies? I may need a little more step by step since I'm not exaclty sure sure how to do this...I've added a a bounty :)
Angela
A: 

Going to close this -- decided to just manually select the right data and re-enter it so I can do some error checking -- a pain but this approach doesn't seem to have an easy answer. Note to self: keep all production data in production versus test-driving.

Angela
A: 

If you only need to do this once, you could get it done easily using ms access.

You can work out any conflict by creating some query in the visual query designer.

You can connect to an sqlite3 database by using the odbc driver for sqllite3 and link those tables in access.

Joel Gauvreau