I wrote a program that generates a Python program, which in turn fills my Django application with data. The generated program, however, is 23 MB and my computer won't run it. Is there a solution for this?

Another way to fill the database would be to use a fixture. The problem is that I don't know the new primary keys yet, so I would have to reuse the old ones (which I would rather avoid).

Any suggestions?

The reason for this approach: I'm migrating a database whose schema is very different from the new one, and it has many relations. The 23 MB program keeps track of every object from the source database, which is why it can't easily be cut in half. Maybe there is a better way to do this? I would rather use Django than raw SQL.

A: 

Your computer won't run it because it is a large program?

Maybe you should keep all the data in an external file (or files) and load it into the database from there, instead of embedding it in your script.

Andor
A: 

Put the data in a separate file (or files). Then write a small program that reads in the data and populates your database via Django.
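
A minimal sketch of that idea, assuming the data is exported as JSON Lines (one JSON object per line). The file format and the names `read_records`, `load_file`, and `save_record` are illustrative assumptions, not something from the question; `save_record` is the place where a Django call such as `SomeModel.objects.create(**record)` (with a hypothetical `SomeModel`) would go.

```python
import json
from typing import Callable, Iterator


def read_records(path: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_file(path: str, save_record: Callable[[dict], None]) -> int:
    """Pass every record to save_record and return how many were loaded."""
    count = 0
    for record in read_records(path):
        # Hypothetical hook: in a Django project this would create a model instance.
        save_record(record)
        count += 1
    return count
```

Because the file is read one line at a time, the loader stays small no matter how large the data file gets.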

Patrick McElhaney
A: 

If I am reading your post correctly, you are reading from the old database and writing a "python program" based on that data. This seems to me to be the wrong way to do this.

My suggestion would be to create a malleable version of the old database (XML would work well for this) by reading the data from the DB, modifying it as needed, and then dumping it into a file.

With this malleable version of the data, use a separate program to import this data into the new database via your Django Models.

This will also give you a level of flexibility if you ever need to duplicate this process.
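
One way the intermediate file could look, sketched with the standard library's `xml.etree.ElementTree`. The element names `records` and `record`, and the helper names `rows_to_xml` and `xml_to_rows`, are assumptions made for illustration; the rows stand in for whatever has been read from the old database.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows: list[dict], path: str) -> None:
    """Write each row as a <record> element whose attributes are the columns."""
    root = ET.Element("records")
    for row in rows:
        # XML attribute values must be strings, so every value is converted.
        ET.SubElement(root, "record", {key: str(value) for key, value in row.items()})
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def xml_to_rows(path: str) -> list[dict]:
    """Read the file back; all values come back as strings."""
    return [dict(element.attrib) for element in ET.parse(path).getroot()]
```

Note that everything comes back as a string, so the import program has to convert numeric columns itself before handing the rows to the Django models.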

Jack M.
A: 

In most cases, you can find a natural hierarchy to your objects. Sometimes there is some kind of "master" and all other objects have foreign key (FK) references to this master and to each other.

When that holds, you can use an XML-like structure with each master object "containing" its subsidiary objects. You insert the master first, so all the children have FK references to an object that already exists.

In some cases, however, there are relationships that can't be simple FKs to an existing object. Then you have circular dependencies and you must (1) break the dependency temporarily and (2) recreate it after the objects are loaded.

You do this by (a) defining your model to have an optional FK and (b) adding a temporary "natural key" reference that holds the old key. You then load the data without the proper FK (it's optional).

Then, after your data is loaded, you go back through a second pass and insert all of the missing FK references. Once this is done you can then modify your model to make the FK mandatory.

Program 1 - export from old database to simple flat-file. CSV format or JSON format or something simple.

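# csvwriter is assumed to be a csv.DictWriter with these three field names
for m in OldModel.objects.all():
    # Keep the old primary key and the old FK target so pass 2 can resolve them.
    row = {'col1': m.col1, 'old_at_fk': m.fktoanothertable.id, 'old_id': m.id}
    csvwriter.writerow(row)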

Program 2 - read simple flat-file; build new database model objects.

# Pass 1 - raw load

# csvreader is assumed to be a csv.DictReader, so each row is a dict of column values
for row in csvreader:
    NewModel.objects.create(**row)

# Pass 2 - resolve FK's

for nm in NewModel.objects.all():
    # Look up the already-loaded target row by its old primary key.
    ref1 = OtherModel.objects.get(old_id=nm.old_at_fk)
    nm.properfk = ref1
    nm.save()
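
The same two-pass idea can be tried without Django at all. The sketch below is a plain-Python stand-in, assuming rows are dicts with hypothetical keys `old_id`, `old_parent_id`, and `name`; a counter plays the role of the database handing out new primary keys, and an old-to-new map replaces the `old_id` lookup.

```python
from itertools import count


def two_pass_load(rows):
    """Load rows keyed by old ids and return them keyed by new ids.

    Each row is a dict with 'old_id', 'old_parent_id' (or None) and 'name'.
    """
    next_id = count(1)   # stand-in for the database assigning primary keys
    new_rows = {}        # new id -> stored row
    old_to_new = {}      # old id -> new id

    # Pass 1: store every row without its reference, remembering the new id.
    for row in rows:
        new_id = next(next_id)
        new_rows[new_id] = {"name": row["name"], "parent_id": None}
        old_to_new[row["old_id"]] = new_id

    # Pass 2: every target now exists, so old references can be translated.
    for row in rows:
        if row["old_parent_id"] is not None:
            new_id = old_to_new[row["old_id"]]
            new_rows[new_id]["parent_id"] = old_to_new[row["old_parent_id"]]

    return new_rows
```

Two rows that point at each other load without trouble, because neither reference is filled in until both rows exist.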
S.Lott
Using pickle to read and write dictionaries looks like the easiest way.
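
For what it's worth, a minimal sketch of what that could look like with the standard `pickle` module. The helper names `dump_rows` and `load_rows` are made up for illustration, and the rows are assumed to be plain dictionaries.

```python
import pickle


def dump_rows(rows, path):
    """Write a list of dictionaries to a binary pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(rows, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_rows(path):
    """Read the list back. Only unpickle files you created yourself."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Unlike CSV, this keeps the Python types (ints, None, nested dicts) intact, at the cost of a file that only Python can read.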
Jack Ha