App Engine Datastore cannot be queried for an aggregate result.

Example: I have an entity called "Post" with the following fields:

Key id, String nickname, String postText, int score

I have many different nicknames and many posts by each nickname in my datastore.

If I want a leaderboard of the ten nicknames with the highest total scores, I would typically write SQL like this:

select nickname, sum(score) as sumscore
from Post 
group by nickname 
order by sumscore desc
limit 10

This type of query is not possible with the Google App Engine Datastore Java API (JDO or JPA).

What are alternative strategies that I could use to achieve a similar result?

As a crude, brute-force approach, I could load every Post entity and compute the aggregation entirely in my application code. This is obviously inefficient on large datasets.
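For reference, the brute-force approach might look roughly like the sketch below. It is plain in-memory Java, not datastore code: the Post class is a hypothetical stand-in for the entity described above, and the list passed to topNicknames is assumed to have been loaded from the datastore already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BruteForceLeaderboard {
    // Hypothetical stand-in for the Post entity (key id omitted).
    static final class Post {
        final String nickname;
        final String postText;
        final int score;

        Post(String nickname, String postText, int score) {
            this.nickname = nickname;
            this.postText = postText;
            this.score = score;
        }
    }

    // Sums scores per nickname, then returns the highest totals first.
    static List<Map.Entry<String, Integer>> topNicknames(List<Post> posts, int limit) {
        Map<String, Integer> totals = new HashMap<>();
        for (Post p : posts) {
            totals.merge(p.nickname, p.score, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        List<Post> posts = Arrays.asList(
                new Post("ann", "first", 5),
                new Post("bob", "second", 7),
                new Post("ann", "third", 4));
        for (Map.Entry<String, Integer> e : topNicknames(posts, 10)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Every call touches every Post, which is why this does not scale.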

What other strategies can I employ?

+4  A: 

Create a Nickname model, and each time you add a new Post, retrieve the corresponding Nickname and increase a stored score sum there. Essentially, do the computation at insert/update-time, not query-time.
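A minimal sketch of the idea, using an in-memory map in place of the datastore. The class and method names (NicknameTotals, recordPost, top) are made up for illustration; in a real app the map would be a Nickname entity kind with a stored total, and the update would ideally happen transactionally alongside the Post write.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NicknameTotals {
    // Hypothetical stand-in for Nickname entities: nickname -> running score sum.
    private final Map<String, Long> totals = new HashMap<>();

    // Call this whenever a Post is saved, so the sum is maintained at write time.
    void recordPost(String nickname, int score) {
        totals.merge(nickname, (long) score, Long::sum);
    }

    // Current stored total for one nickname (0 if it has no posts yet).
    long totalFor(String nickname) {
        return totals.getOrDefault(nickname, 0L);
    }

    // Nicknames with the highest totals first. With real entities this would be
    // a plain query on the stored total: descending sort order plus a limit.
    List<String> top(int limit) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(totals.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(limit, entries.size()))) {
            result.add(e.getKey());
        }
        return result;
    }
}
```

The leaderboard read then only touches the precomputed totals, never the individual Post entities.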

Amber
Hi Amber. Thank you for your contribution. I am already doing this to some degree (my model is more complex than I described). I already aggregate a lot of data on inserts and updates to work around this limitation. But it is not feasible to store every possible aggregate statistic this way, because I have many different aggregates that I would like to calculate every now and then. Still, this is a valid answer.
Patrick
Amber's approach is right, and it will scale. I use an approach very similar to 'fan-in with materialized views' (http://code.google.com/events/io/2010/sessions/high-throughput-data-pipelines-appengine.html) to compute dozens of aggregates. It works quite well.
Robert Kluin
I am using this technique, coupled with sharding to minimise contention (http://code.google.com/appengine/articles/sharding_counters.html). I am also able to atomically postpone the updates of such counters and statistics. I am marking this answer as the best one.
Patrick
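For readers unfamiliar with the sharding mentioned in the last comment, here is a rough in-memory sketch of the idea: a counter is split into several shards, each write goes to one randomly chosen shard, and a read sums all of them. The ShardedCounter class is hypothetical; in the datastore each shard would be its own entity, so that concurrent writes rarely hit the same one.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLongArray;

class ShardedCounter {
    // Each slot stands in for one shard entity in the datastore.
    private final AtomicLongArray shards;
    private final Random random = new Random();

    ShardedCounter(int numShards) {
        this.shards = new AtomicLongArray(numShards);
    }

    // A write picks one shard at random, spreading contention across shards.
    void add(long delta) {
        int i = random.nextInt(shards.length());
        shards.addAndGet(i, delta);
    }

    // A read has to visit every shard and sum the partial counts.
    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }
}
```

More shards means less write contention but more work per read, so the shard count is a trade-off to tune per counter.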