I've used bulkloader.Loader to load data into the GAE dev and live datastores, but my next task is to create objects from non-CSV data and push them into the datastore.

So say my object is something like:

class CainEvent(db.Model):
    name = db.StringProperty(required=True)
    birthdate = db.DateProperty()

Can anyone give me a simple example of how to do this, please?

+2  A: 

Here's an extremely simplified example of what we're doing to use the bulkloader to load JSON data instead of CSV data:

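import json
from google.appengine.tools import bulkloader

class JSONLoader(bulkloader.Loader):
    def generate_records(self, filename):
        with open(filename) as f:  # one list of strings per JSON object
            for item in json.load(f):
                yield item['fields']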

In this example, I'm assuming a JSON format that looks something like

[
    {
        "fields": [
            "a", 
            "b", 
            "c", 
            "d"
        ]
    }, 
    {
        "fields": [
            "e", 
            "f", 
            "g", 
            "h"
        ]
    }
]

which is oversimplified.

Basically, all you have to do is create a subclass of bulkloader.Loader and implement (at a minimum) the generate_records method, which should yield lists of strings. This same strategy would work for loading data from XML files or ROT13-encrypted files or whatever.
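To illustrate the XML case, here is a minimal sketch of the record-generating logic, written as a plain function so it runs without the App Engine SDK. The XML layout (a root element containing record elements, each with field children) and the function name generate_records_from_xml are assumptions made for illustration; in a real loader, the body would go inside generate_records.

```python
import xml.etree.ElementTree as ET


def generate_records_from_xml(filename):
    """Yield one list of strings per <record> element in the file."""
    tree = ET.parse(filename)
    for record in tree.getroot().iter('record'):
        # Each <field> child contributes one string, in document order.
        # An empty <field/> has no text, so it becomes an empty string.
        yield [field.text or '' for field in record.findall('field')]
```

Only the parsing step changes between formats; the contract (yield one list of strings per entity) stays the same.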

Note that the list of strings yielded by the generate_records method must match up (in length and order) with the "properties" list you provide when you initialize the loader (i.e., the second argument to the AlbumLoader.__init__ method in this example).
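The length-and-order requirement can be sketched in plain Python. The properties list below uses (name, converter) pairs of the kind passed to bulkloader.Loader.__init__, with the CainEvent fields from the question. The helper apply_properties is a hypothetical stand-in for what the loader does with each record, not part of the SDK, and the date format is an assumption.

```python
import datetime


def parse_date(value):
    # Assumed input format: ISO-style 'YYYY-MM-DD'.
    return datetime.datetime.strptime(value, '%Y-%m-%d').date()


# (property name, converter) pairs, in the same order as each record.
PROPERTIES = [
    ('name', str),
    ('birthdate', parse_date),
]


def apply_properties(record, properties=PROPERTIES):
    """Hypothetical stand-in: pair each string with its converter."""
    if len(record) != len(properties):
        raise ValueError('expected %d fields, got %d'
                         % (len(properties), len(record)))
    return {name: convert(value)
            for (name, convert), value in zip(properties, record)}
```

With this layout, a record such as ['Cain', '1970-01-02'] lines up with the two properties, while a record with a missing or extra field is rejected rather than silently misaligned.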

This approach provides a lot of flexibility. In our real code, we override the __init__ method of our JSONLoader implementation to determine automatically the kind of model being loaded and its list of properties, which we then pass to the bulkloader.Loader parent class.

Will McCutchen
A: 

You may find this post useful - it details how to load data directly from an RDBMS, but the approach applies equally to loading from any other source.

Nick Johnson