I have about 4,000 records that I need to upload to the datastore. They are currently in CSV format. I'd appreciate it if someone could point me to, or explain, a way to upload data in bulk to GAE.

Thank you very much. Help appreciated.

+3  A: 

You can use the bulkloader.py tool:

The bulkloader.py tool included with the Python SDK can upload data to your application's datastore. With just a little bit of set-up, you can create new datastore entities from CSV files.
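
For illustration, a loader configuration for a CSV file might look roughly like the sketch below. This follows the pattern I recall from the Python SDK documentation of that era, so treat the details as assumptions and check them against your SDK version. The `Record` model, its fields, and the file names are hypothetical stand-ins for your own data.

```python
# record_loader.py (hypothetical file name)
# Sketch of a bulkloader configuration, assuming the Python SDK's
# google.appengine.tools.bulkloader module is available.
from google.appengine.ext import db
from google.appengine.tools import bulkloader


class Record(db.Model):
    # Hypothetical model: replace with your own kind and properties.
    name = db.StringProperty()
    quantity = db.IntegerProperty()


class RecordLoader(bulkloader.Loader):
    def __init__(self):
        # One (property name, converter) pair per CSV column, in column order.
        bulkloader.Loader.__init__(self, 'Record',
                                   [('name', str),
                                    ('quantity', int)])


# The upload tool looks for this module-level list.
loaders = [RecordLoader]
```

Assuming that layout, the upload would then be started with something like `appcfg.py upload_data --config_file=record_loader.py --filename=records.csv --kind=Record <app-directory>`.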

waqas
With a little extra work, you can even load data directly from an SQL database or any other data source.
Nick Johnson
A: 

I don't have the perfect solution, but I suggest you have a go with the App Engine Console. App Engine Console is a free plugin that lets you run an interactive Python interpreter in your production environment. It's helpful for one-off data manipulation (such as initial data imports) for several reasons:

  1. It's the good old read-eval-print interpreter. You can do things one at a time instead of having to write the perfect import code all at once and run it in batch.
  2. You have interactive access to your own data model, so you can read/update/delete objects from the data store.
  3. You have interactive access to the URL Fetch API, so you can pull data down piece by piece.

I suggest something like the following:

  1. Get your data model working in your development environment
  2. Split your CSV records into chunks of under 1,000. Publish them somewhere like Amazon S3 or any other URL.
  3. Install App Engine Console in your project and push it up to production
  4. Log in to the console. (Only admins can use the console so you should be safe. You can even configure it to return HTTP 404 to "cloak" from unauthorized users.)
  5. For each chunk of your CSV:
    1. Use URLFetch to pull down a chunk of data
    2. Use the built-in csv module to chop up your data until you have a list of useful data structures (most likely a list of lists or something like that)
    3. Write a for loop, iterating through each data structure in the list:
      1. Create a data object with all correct properties
      2. put() it into the data store

You should find that after one iteration through step 5, you can either copy and paste or write simple functions to speed up your import task. Also, because steps 5.1 and 5.2 only fetch and process your data, you can take your time until you are sure you have it right.

(Note, App Engine Console currently works best with Firefox.)
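
The chunking, parsing, and looping steps above can be sketched in plain Python. This is only a minimal illustration using the standard library: `chunk_rows`, `parse_csv`, and `import_chunk` are hypothetical helper names, the URL Fetch step is left out, and `make_entity` and `put` are stand-ins for building your own model object and calling its `put()`.

```python
import csv
import io


def chunk_rows(rows, max_rows=999):
    """Yield successive lists of at most max_rows rows (step 2)."""
    for start in range(0, len(rows), max_rows):
        yield rows[start:start + max_rows]


def parse_csv(text):
    """Parse CSV text whose first row is a header into a list of dicts (step 5.2)."""
    return list(csv.DictReader(io.StringIO(text)))


def import_chunk(records, make_entity, put):
    """Build one entity per record and store it (step 5.3).

    make_entity and put are supplied by the caller; in the console they
    would wrap your model class and its put() method.
    Returns the number of entities stored.
    """
    count = 0
    for record in records:
        entity = make_entity(record)
        put(entity)
        count += 1
    return count
```

For example, `import_chunk(parse_csv(text), make_entity, stored.append)` with a plain list `stored` lets you dry-run the conversion before pointing `put` at the real datastore.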

jhs
A: 

In later versions of the App Engine SDK, you can upload data using appcfg.py.

See appcfg.py.

rbawaskar
A: 

I'm having a problem just like the original poster's. I'm using the Eclipse plugin for Java. I've been trying to solve it by running my application on my local machine, which creates a local_db.bin. I've been trying to find ways to upload the contents of local_db.bin. The only lead I have is to use bulkloader.py. The big problem is that it is for the Python Google App Engine. Since I'm using the Java Google App Engine, I'm not sure bulkloader.py will let me communicate with my hosted app, which was created in Java.

Michael