I have a file that contains ~16,000 lines of information on entities. The user uploads the file through an HTML upload form, and the system handles it by reading the file line by line, creating an entity for each line, and put()'ing the entities into the datastore.

I'm limited by the 30-second request time limit. I have tried a lot of different workarounds using the Task Queue, forced HTML redirecting, etc., and nothing has worked for me.

I am using forced HTML redirecting to delete all the data, and this works, albeit VERY slowly. (See the 4th answer here: http://stackoverflow.com/questions/108822/delete-all-data-for-a-kind-in-google-app-engine)

I can't seem to apply this to my uploading problem, since my method has to be a POST method. Is there a solution? Sample code would be much appreciated, since I'm very new to web development in general.

+2  A: 

To solve a similar problem, I stored the dataset in a model with a single TextProperty, then spawned a taskqueue task (a code sketch follows the list) that:

  1. Fetches a dataset record from the datastore, if there are any left.

  2. Checks whether the length of the dataset is <= N, where N is some small number of entities you can put() without a timeout (I used 5). If so, it writes the individual entities, deletes the dataset record, and spawns a new copy of the task.

  3. If the dataset is bigger than N, it splits it into N parts in the same format, writes those to the datastore, deletes the original entity, and spawns a new copy of the task.
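
A rough sketch of that task handler, assuming a hypothetical Dataset model that holds the raw lines in a TextProperty, a hypothetical MyEntity target kind, and a task URL of /_tasks/process_chunks (none of these names come from the original answer):

    from google.appengine.ext import db, webapp
    from google.appengine.api import taskqueue  # older SDKs: google.appengine.api.labs.taskqueue

    N = 5  # number of entities you can safely put() in one task run


    class Dataset(db.Model):
        lines = db.TextProperty()  # one chunk of raw uploaded lines, newline-separated


    class MyEntity(db.Model):
        value = db.StringProperty()  # hypothetical kind built from one input line


    class ProcessChunks(webapp.RequestHandler):
        def post(self):
            # 1. Fetch one remaining dataset record, if there are any left.
            chunk = Dataset.all().get()
            if chunk is None:
                return  # nothing left to do

            lines = chunk.lines.splitlines()

            if len(lines) <= N:
                # 2. Small enough: write the individual entities.
                db.put([MyEntity(value=line) for line in lines])
            else:
                # 3. Too big: split into roughly N parts in the same format.
                size = len(lines) // N + 1
                for i in range(0, len(lines), size):
                    # put() each part separately so no single API call nears the 1MB limit
                    Dataset(lines=db.Text('\n'.join(lines[i:i + size]))).put()

            # Either way, delete the original record and spawn a new copy of the task.
            chunk.delete()
            taskqueue.add(url='/_tasks/process_chunks')

The chain stops by itself once no Dataset records remain.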

Wooble
That sounds about right to me.
Will McCutchen
So what does the single TextProperty really contain? I'm currently doing this to get the uploaded file: ftmp = unicode(self.request.get('myfile'), 'utf-16'), then file = ftmp.splitlines(), which gives me a list of all the lines in the file. Do you suggest I store ftmp (which I believe is a string containing the contents of the file in one huge chunk) in the single TextProperty?
Jack Low
Also, I get this when I try to put() ftmp into a TextProperty: RequestTooLargeError: The request to API call datastore_v3.Put() was too large.
Jack Low
In my case, I converted the data received from Flickr from XML into a CSV-ish format containing just the fields I actually wanted to save, but storing plain text probably works better if you're dealing with an already nicely formatted dataset. The RequestTooLargeError indicates your initial entity is bigger than 1MB; you probably want to split it into smaller chunks in the initial handler.
Wooble
A: 

If you're doing this to bulk load data, why not use the bulk loader?
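
For reference, a rough sketch of what the SDK-era bulk loader workflow looks like: a loader config file mapping CSV columns to properties, plus an appcfg.py upload_data invocation. The kind, property, and file names below are made up, /remote_api must be mapped in app.yaml first, and the exact flags may differ by SDK version, so check the docs for yours.

    # entity_loader.py -- hypothetical bulk loader config
    from google.appengine.tools import bulkloader

    class EntityLoader(bulkloader.Loader):
        def __init__(self):
            # Maps CSV columns to datastore properties for a kind called 'MyEntity'.
            bulkloader.Loader.__init__(self, 'MyEntity',
                                       [('name', str),
                                        ('value', str)])

    loaders = [EntityLoader]

Then the upload is run from the command line, roughly:

    appcfg.py upload_data --config_file=entity_loader.py --filename=entities.csv \
        --kind=MyEntity --url=http://yourapp.appspot.com/remote_api <app-directory>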

If you need the interface to be accessible to non-admin users, then, as suggested, you need to break the file up into decent-sized chunks (by taking blocks of n lines each), put them into the datastore, and start a task to deal with each of them.
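
A rough sketch of that flow, reusing the same kind of hypothetical Dataset chunk model and MyEntity target kind as in the earlier sketch (handler names, URLs, and the chunk size are illustrative, not from this answer):

    from google.appengine.ext import db, webapp
    from google.appengine.api import taskqueue  # older SDKs: google.appengine.api.labs.taskqueue

    LINES_PER_CHUNK = 500  # tune so each chunk stays well under the 1MB entity limit


    class Dataset(db.Model):
        lines = db.TextProperty()  # one chunk of raw uploaded lines


    class MyEntity(db.Model):
        value = db.StringProperty()  # hypothetical kind built from one input line


    class Upload(webapp.RequestHandler):
        def post(self):
            text = unicode(self.request.get('myfile'), 'utf-16')
            lines = text.splitlines()

            # Store the raw lines in chunks and start one task per chunk.
            for i in range(0, len(lines), LINES_PER_CHUNK):
                chunk = Dataset(lines=db.Text('\n'.join(lines[i:i + LINES_PER_CHUNK])))
                chunk.put()
                taskqueue.add(url='/_tasks/process_chunk',
                              params={'key': str(chunk.key())})

            self.response.out.write('Upload accepted; entities are being created.')


    class ProcessChunk(webapp.RequestHandler):
        def post(self):
            # Each task converts one stored chunk into real entities, then cleans up.
            chunk = Dataset.get(self.request.get('key'))
            if chunk is None:
                return
            db.put([MyEntity(value=line) for line in chunk.lines.splitlines()])
            chunk.delete()

The upload request only stores the raw chunks and enqueues the tasks, so it finishes well within the request deadline; the per-chunk tasks do the slow put() work in the background.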

Nick Johnson