Hi. I've written an app that scans the internet and saves some of the data it retrieves. After a while the percentage of Datastore quota used (Total Stored Data) jumped from 7% to 99%. I stopped my robot, but the figure rose to 100% after some time. The Datastore stats, though, say that the total volume of data stored in the datastore is about 200MB and the total number of entities is 501,000...

Does anyone know why that could be?

Thank you in advance.

Tim.

+2  A: 

It could be indexes. If you have many indexed properties, especially list properties, the data storage number can easily be several times higher than the stats stored data number.
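
To get a feel for how quickly this adds up, here is a rough back-of-the-envelope sketch in plain Python. It does not use the App Engine SDK. It assumes that each indexed property value is written to two built-in index tables (ascending and descending) and that a list property costs that once per element, and it ignores the kind index and any composite indexes. The function name `builtin_index_rows` and the sample `page` entity are made up for illustration.

```python
# Rough estimate of built-in index rows written for one datastore entity.
# Assumption: every indexed property value costs two index rows
# (one ascending, one descending); a list property costs that per element.
# The kind index and composite indexes are deliberately left out.

def builtin_index_rows(properties):
    """properties maps a property name to a (value, indexed) pair."""
    rows = 0
    for value, indexed in properties.values():
        if not indexed:
            continue  # unindexed properties add no index rows
        count = len(value) if isinstance(value, (list, tuple)) else 1
        rows += 2 * count
    return rows


# Hypothetical entity saved by a crawler.
page = {
    "url": ("http://example.com/", True),
    "fetched_at": (1290000000, True),
    "keywords": (["gae", "datastore", "quota", "index"], True),
    "body": ("...page text...", False),
}

print(builtin_index_rows(page))
```

Under these assumptions the small `keywords` list alone accounts for 8 of the 12 index rows, so across 500,000 entities the index data can plausibly outweigh the entities themselves.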

There is a good article explaining how space is used. http://code.google.com/appengine/articles/storage_breakdown.html

You can also star issue 2740 to request that statistics be provided for indexes too. http://code.google.com/p/googleappengine/issues/detail?id=2740

Robert Kluin
Well, I had an entity with small lists of strings, which I thought might be what was clogging my indexes, so I deleted it. However, the stats say I am still using 99% of my quota...
Ibolit
It can take some time for all of the data usage numbers to be updated. Do you have custom indexes defined, particularly those involving string properties?
Robert Kluin
Well, I don't have any custom indexes (or indexes generated automatically by the Eclipse plugin) on the entity I deleted. I do have a two-field index, consisting of two integer fields, on another entity. And there is another thing that I think might be important: my Dashboard says "Resource is currently experiencing a short-term quota limit." Would it be called "a short-term quota limit" if the datastore were filled to the brim?
Ibolit
If you are using the task queue, you might want to check task storage usage. You can find that under the tasks section in the admin console. I created issue 2740 because it is often very unclear which indexes are actually using your data quota -- if you have not already, star it. The short-term quota message is probably unrelated to your storage usage.
Robert Kluin
Thank you very much for all your help and attention. For the past several days I've been sending data from the datastore to myself and then deleting it. Now that a considerable part of it has been deleted, the datastore usage level has dropped. However, I did *not* have any (explicit) indexes on this entity.
Ibolit
You might want to define any properties you do not query on as unindexed. In Python you do this by passing indexed=False to the property definition, e.g. my_prop = db.IntegerProperty(indexed=False). I have seen other reports of the datastore usage value behaving just as you describe: a steady unexplained increase, then a sudden drop after several days. Perhaps this has to do with BigTable's compaction cycles, but I am not sure about that.
Robert Kluin