views:

333

answers:

4
+1  Q: 

Migration to GAE

What is the best way to migrate MySQL tables to the Google Datastore and create Python models for them?

I have a PHP+MySQL project that I want to migrate to a Python+GAE project. So far the big obstacle is migrating the tables and creating corresponding models. Each table is about 110 columns wide. Creating a model for each table manually is tedious, let alone writing a loader and importing a generated CSV representation of each table.

Is there a more efficient way for me to do the migration?

+1  A: 

In your shoes, I'd write a one-shot Python script to read the existing MySQL schema (with MySQLdb), generating a models.py to match (then do some manual checks and edits on the generated code, just in case). That's assuming that a data model with "about 110" properties per entity is something you're happy with and want to preserve, of course; it might be worth taking the opportunity to break things up a bit (indeed you may have to if your current approach also relies on joins or other SQL features GAE doesn't give you), but that of course requires more manual work.
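
A minimal sketch of such a one-shot script (the information_schema query, the connection details, the class/property naming and the type mapping below are illustrative assumptions, and the generated code would still need the manual review mentioned above):

import MySQLdb

# Maps MySQL data types to Datastore property classes; extend as needed.
TYPE_MAP = {
    'varchar': 'StringProperty',
    'char': 'StringProperty',
    'text': 'TextProperty',
    'int': 'IntegerProperty',
    'bigint': 'IntegerProperty',
    'float': 'FloatProperty',
    'datetime': 'DateTimeProperty',
}

def generate_models(db_name, user, passwd, host='localhost'):
    conn = MySQLdb.connect(host=host, user=user, passwd=passwd, db=db_name)
    cursor = conn.cursor()
    cursor.execute(
        "SELECT table_name, column_name, data_type "
        "FROM information_schema.columns "
        "WHERE table_schema = %s ORDER BY table_name, ordinal_position",
        (db_name,))
    lines = ['from google.appengine.ext import db']
    current_table = None
    for table, column, data_type in cursor.fetchall():
        if table != current_table:
            # Start a new model class whenever we reach a new table.
            lines.append('')
            lines.append('class %s(db.Model):' % table.title().replace('_', ''))
            current_table = table
        prop = TYPE_MAP.get(data_type, 'StringProperty')
        lines.append('    %s = db.%s()' % (column, prop))
    return '\n'.join(lines)

if __name__ == '__main__':
    print(generate_models('mydb', 'dbuser', 'secret'))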

Once the data model is in place, bulk loading can happen, typically via intermediate CSV files (there are several ways you can generate those).
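
As a rough illustration of one way to produce those CSV files (table names and credentials here are placeholders):

import csv
import MySQLdb

def dump_table_to_csv(conn, table, filename):
    # Write every row of the table to a CSV file the bulk loader can consume.
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM " + table)
    out = open(filename, 'wb')
    writer = csv.writer(out)
    for row in cursor.fetchall():
        writer.writerow(row)
    out.close()

conn = MySQLdb.connect(user='dbuser', passwd='secret', db='mydb')
dump_table_to_csv(conn, 'customers', 'customers.csv')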

Alex Martelli
That's what I am resolving to do. I was wondering if anyone had created such a tool already.
notnoop
@msaeed, AFAIK, there are as yet no such "migration aids" publicly available. Most projects using a relational DB will need a lot of manual work to carefully denormalize the schema to avoid joins, etc, etc, so the mere mechanical transcription of schema to models isn't really all that useful...:-(
Alex Martelli
@Alex, thanks. (Un-)Fortunately for me, my schemas are already denormalized.
notnoop
+3  A: 

In general, generating your models automatically shouldn't be too difficult. Suppose you have a CSV file for each table, with one (field name, data type) pair per line; then something like this would do the job:

import csv

# Maps MySQL types to Datastore property classes
type_map = {
    'char': 'StringProperty',
    'text': 'TextProperty',
    'int': 'IntegerProperty',
    # ...
}

def generate_model_class(classname, definition_file):
  """Returns the source of a db.Model subclass for one table definition."""
  ret = ["class %s(db.Model):" % (classname,)]
  for fieldname, fieldtype in csv.reader(open(definition_file)):
    ret.append("  %s = db.%s()" % (fieldname, type_map[fieldtype]))
  return "\n".join(ret)
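
For example, with a hypothetical class name and definition file, the output can be pasted straight into models.py:

print(generate_model_class('Customer', 'customer_fields.csv'))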

Once you've defined your schema, you can bulk load directly from the DB - no need for intermediate CSV files. See my blog post on the subject.

Nick Johnson
+1  A: 

approcket can replicate MySQL ⇌ GAE, or you can use GAE's built-in remote_api from Google.

LarsOn
While the best so far, `approcket` is buggy and not well documented. It also doesn't generate models.
notnoop
+1  A: 

You could migrate them to Django models first.

In particular, use

python manage.py inspectdb > models.py

Then edit models.py until you are satisfied. You might have to put ForeignKeys in, adjust the lengths of CharFields, etc.
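
For example, a typical hand-edit might look like this (the model and fields are made up for illustration, using the older Django syntax of the time; the comments show the kind of guesses inspectdb makes):

from django.db import models

class Album(models.Model):
    # inspectdb guessed something like: title = models.CharField(max_length=200)
    title = models.CharField(max_length=255)
    # inspectdb guessed something like: artist_id = models.IntegerField()
    artist = models.ForeignKey('Artist', db_column='artist_id')

    class Meta:
        db_table = 'album'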

I've converted several legacy databases to Django like this with good success.

Django models, however, are different from GAE models (which I'm not very familiar with), so this may not be terribly helpful; I don't know!

Nick Craig-Wood