tags:

views: 27

answers: 3

I'm creating a set of SQL full database copy scripts using MySQL's INTO OUTFILE and LOAD DATA LOCAL INFILE.

Specifically:

SELECT {columns} FROM {table} INTO OUTFILE '{table}.csv'

LOAD DATA LOCAL INFILE '{table}.csv' REPLACE INTO TABLE {table} ({columns})

Because of this, I don't need just the tables; I also need the columns for each table.
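
For example, here's roughly how I'm filling those templates in once I have a table name and its column list (the names below are just illustrative):

# A rough sketch of rendering the two statements above; the table and
# column names here are only examples.
table = 'auth_user'
columns = ['id', 'username', 'email']

cols = ', '.join(columns)
dump_sql = "SELECT %s FROM %s INTO OUTFILE '%s.csv'" % (cols, table, table)
load_sql = ("LOAD DATA LOCAL INFILE '%s.csv' REPLACE INTO TABLE %s (%s)"
            % (table, table, cols))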

I can get all of the tables and columns, but this doesn't include m2m tables:

from django.db.models import get_models
for model in get_models():
    table = model._meta.db_table
    columns = [field.column for field in model._meta.fields]

I can also get all of the tables, but this doesn't give me access to the columns:

from django.db import connection
tables = connection.introspection.table_names()

How do you get every table and every corresponding column on that table for a Django project?

More details:

I'm doing this on a reasonably large dataset (>1GB), so using the flat-file method seems to be the only reasonable way to make a copy this large in MySQL. I already have the schema copied over (using ./manage.py syncdb --migrate) and the issue I'm having is specifically with copying the data, which requires me to have the tables and columns to create proper SQL statements. Also, the reason I can't use default column ordering is that the production database I'm copying from has different column ordering than what is created with a fresh syncdb (due to many months' worth of migrations and schema changes).
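
Since both statements name their columns explicitly, what I really need is an authoritative column list per table; for reference, the column order that actually exists in the live database can be read with Django's introspection (a sketch, with an illustrative table name):

from django.db import connection

cursor = connection.cursor()
# Column names in the order the production table actually defines them.
live_columns = [row[0] for row in
                connection.introspection.get_table_description(cursor, 'auth_user')]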

+2  A: 

Have you taken a look at manage.py?

You can get boatloads of SQL information. For example, to get all of the CREATE TABLE syntax for an app within your project you can do:

python manage.py sqlall <appname>

If you type:

python manage.py help

You can see a ton of other features.

Bartek
and if you need to do this within a script, you can use `management.call_command('sqlall')` etc.
Daniel Roseman
I'm using manage.py syncdb --migrate to do the actual schema copy. I actually started out using sqlall to do the schema migration. The problem there was that we use South with some migrations that modify the schema slightly from what normal syncdb produces; specifically, we make the auth_user.username field longer. Kind of a hack, I know. The problem here is that I need to create flat files of the actual data (>1GB) so that I can stream it over a network connection. To create those flat files, I think I need to know the table and column information. Thanks for the suggestion though.
Wes Winham
A: 

Have you looked into "manage.py dumpdata" and "manage.py loaddata"? They dump and load in json format. I use it to dump stuff from one site and overwrite another site's database. It doesn't have an "every database" option on dumpdata, but you can call it in a loop on the results of a "manage.py dbshell" command.
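
If the data size allows it, that loop is easy to script with call_command; here's a rough sketch that loops over INSTALLED_APPS instead of parsing dbshell output (and assumes a Django version where call_command accepts a stdout argument):

from django.conf import settings
from django.core.management import call_command

# Dump each installed app to its own JSON fixture file.
for app in settings.INSTALLED_APPS:
    label = app.split('.')[-1]  # app label, e.g. 'auth'
    with open('%s.json' % label, 'w') as fixture:
        call_command('dumpdata', label, stdout=fixture)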

eruciform
We love dumpdata/loaddata for testing, but it's not the greatest when it comes to using MySQL with InnoDB (we actually had to patch Django so that our fixtures will load for testing). loaddata is just way too slow, unfortunately, when it comes to loading a >1GB database, especially over a network. Thanks for the suggestion though.
Wes Winham
A: 

I dug into the source to find this solution. I feel like there's probably a better way, but this does the trick.

This first block gets all of the normal (non-m2m) tables and their columns:

from django.db import connection

table_info = []
tables = connection.introspection.table_names()
seen_models = connection.introspection.installed_models(tables)
for model in seen_models:
    table = model._meta.db_table
    columns = [field.column for field in model._meta.fields]
    table_info.append((table, columns))

This next block was the tricky part. It gets all the m2m field tables and their columns.

for model in seen_models:
    for field in model._meta.local_many_to_many:
        # Skip m2m fields that use an explicit "through" model; those tables
        # are already covered as regular models above.
        if not field.creates_table:
            continue

        table = field.m2m_db_table()  # name of the join table
        columns = ['id'] # They always have an id column
        columns.append(field.m2m_column_name())   # FK column pointing back to this model
        columns.append(field.m2m_reverse_name())  # FK column pointing to the related model
        table_info.append((table, columns))
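
Putting it together with the statements from the question, table_info can then be rendered into the dump and load scripts (a sketch; the output file names are arbitrary):

# Render one SELECT ... INTO OUTFILE and one LOAD DATA statement per table.
dump_lines, load_lines = [], []
for table, columns in table_info:
    cols = ', '.join(columns)
    dump_lines.append("SELECT %s FROM %s INTO OUTFILE '%s.csv';" % (cols, table, table))
    load_lines.append("LOAD DATA LOCAL INFILE '%s.csv' REPLACE INTO TABLE %s (%s);"
                      % (table, table, cols))

open('dump_all.sql', 'w').write('\n'.join(dump_lines) + '\n')
open('load_all.sql', 'w').write('\n'.join(load_lines) + '\n')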
Wes Winham