views: 140
answers: 3

I tried to do something like

ss = Screenshot(key=db.Key.from_path('myapp_screenshot', 123), name='flowers')
db.put([ss, ...])

It seems to work on my dev_appserver, but on live I get this traceback:

E 05-07 09:50PM 19.964 File "/base/data/home/apps/quixeydev3/12.341796548761906563/common/appenginepatch/appenginepatcher/patch.py", line 600, in put
E 05-07 09:50PM 19.964 result = old_db_put(models, *args, **kwargs)
E 05-07 09:50PM 19.964 File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1278, in put
E 05-07 09:50PM 19.964 keys = datastore.Put(entities, rpc=rpc)
E 05-07 09:50PM 19.964 File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 284, in Put
E 05-07 09:50PM 19.965 raise _ToDatastoreError(err)
E 05-07 09:50PM 19.965 InternalError: the new entity or index you tried to insert already exists

I happen to know just the ID of an existing Screenshot entity I want to update; that's why I was manually constructing its key. Am I doing it wrong?

Update: I filed this as Google App Engine issue 3209.

Update 2: It looks like the GAE bug is actually extremely minor. It seems to be due to the fact that I called something like db.put([ss, ss2, ss]), i.e. the call just fails when the list references the same model twice.

Update 3: OK, I think I finally know what's going on here, and I'm updating this question because right now it's the only Google result for "the new entity or index you tried to insert already exists".

This InternalError seems to arise when the Datastore attempts two writes of a BigTable row for the same entity key, within the same Remote Procedure Call. This can happen if you manually give two entities the same key, or if you put() two new entities without specifying a key, and the same ID gets allocated to both entities in parallel. In the latter case, the solution is to use db.allocate_ids().
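The allocate-then-put pattern described above can be sketched as follows. This is a hedged, pure-Python illustration: `allocate_ids` here is a toy stand-in for the real `db.allocate_ids(db.Key.from_path(kind, 1), count)` call, which likewise returns an inclusive `(first, last)` range of IDs reserved for that kind, and the dicts stand in for model instances.

```python
def allocate_ids(next_free, count):
    # Toy stand-in for db.allocate_ids: returns an inclusive (first, last)
    # range of IDs that will never be handed out again.
    return next_free, next_free + count - 1

def build_entities(names, next_free=1):
    first, last = allocate_ids(next_free, len(names))
    # Each entity gets a distinct pre-reserved ID, so a single bulk put
    # can never attempt to write the same datastore row twice in one RPC.
    return [{'id': i, 'name': n} for i, n in zip(range(first, last + 1), names)]
```

On App Engine you would then construct each entity with `key=db.Key.from_path('Screenshot', reserved_id)` before a single `db.put(entities)`.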

A: 
Gabriel
Thanks, but I'm pretty sure db.put() is supposed to be the way to do bulk updates. I can't afford to hit the datastore separately for each entity.
Liron Shapira
oh, maybe try doing the put on the collection without any parameters.
Gabriel
What do you mean "without any parameters"? How is that different from my code in the question?
Liron Shapira
I'm telling you exactly what Emilien is telling you. I don't know if it will work, but let's say for a moment that ss = Screenshot(key=db.Key.from_path('myapp_screenshot', 123), name='flowers'). Instead of trying to insert the records with db.put([ss, ...]), try just putting the entity itself: ss.put(). If you're querying them one at a time, there's no benefit to "updating them in bulk". But you might try collecting the objects in a variable and then sending a single put().
Gabriel
As above, there's absolutely no behavioural difference between the two methods.
Nick Johnson
A: 

If you want to update just one entity, you can use Model.put() instead of db.put(). This will create the entity if it doesn't already exist, or update it if it does.

So you would do ss.put().

Emilien
Unfortunately it's essential that I do the updates in bulk. I'm wondering if anyone knows exactly what the error in the question means, since Google has zero results for it.
Liron Shapira
Why is it "essential [to] do the updates in bulk"? If it's for performance reasons (i.e. to avoid hitting the max request time), you would be better off using offline processing and adding a task for each entity to update (using Task Queues). That way you won't hit the time limits, even with a million entities to update. If time isn't the constraint, use a transaction (that's what db.put() is doing anyway if the entities you put() belong to the same entity group).
Emilien
The behaviour of calling Model.put should be no different to that of db.put, so this isn't the problem.
Nick Johnson
+2  A: 

This is a bug - you should be able to do exactly what you describe. As a workaround until we can fix it, using key names (even if they're numeric) instead of IDs should work fine.

Nick Johnson
Is there an issue for this on [Google code issues page](http://code.google.com/p/googleappengine/issues/)? I probably overlooked it when searching for it, but I would like to star it.
David Underhill
If you can't find one, please do file a new bug!
Nick Johnson
Thanks, now I feel proud of myself for breaking the great GAE platform :) I can't use key names as a workaround because I still want the datastore to auto-generate unique IDs. I still don't understand what I even did to trigger the bug. I'm guessing it's my manual setting of an entity's key, and I can work around it by doing a get() first and having GAE construct the keys.
Liron Shapira
Thanks for creating [the issue](http://code.google.com/p/googleappengine/issues/detail?id=3209) Liron; I've starred it too :).
David Underhill
@Liron In the worst case, you could use the allocate_ids call to request an ID, then stringify it and set it as a key name, but I agree that that's a horrid hack.
Nick Johnson
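The "horrid hack" Nick describes can be sketched like so, hedged: on App Engine the inclusive `(first, last)` range would come from `db.allocate_ids`, and the resulting strings would be passed as `key_name=` when constructing each entity, since key names sidestep the numeric-ID assignment path.

```python
def key_names_from_range(first, last):
    # Turn an inclusive reserved-ID range (the shape db.allocate_ids
    # returns) into string key names, one per entity to be written.
    # Key names must be strings; a stringified reserved ID is still unique.
    return [str(i) for i in range(first, last + 1)]

# On GAE this would become something like:
#   ss = Screenshot(key_name=names[0], name='flowers')
```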
Hey, it looks like the GAE bug is actually extremely minor. It seems to be due to the fact that I called something like db.put([ss, ss2, ss]), i.e. the call just fails when the list references the same model twice.
Liron Shapira
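Given that diagnosis, a minimal defensive sketch is to drop duplicate references before the bulk put. This helper is an assumption, not part of the GAE API: it deduplicates by object identity (`id()`), since two distinct entities may legitimately share field values, and only repeated references to the same instance trigger the error.

```python
def unique_entities(entities):
    # Keep only the first reference to each model instance, preserving order.
    seen = set()
    result = []
    for entity in entities:
        if id(entity) not in seen:
            seen.add(id(entity))
            result.append(entity)
    return result

# db.put(unique_entities([ss, ss2, ss]))  # ss would now be written once
```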