I am expecting a really large number of results, so in-memory sorting won't be efficient.
You can get at most 1000 results from a query, so sorting them in memory, every which way you like, will be extremely efficient -- e.g., consider, on my MacBook Air (first-day, slowest model):
$ python -mtimeit -s'import random; x=list(range(1000)); random.shuffle(x)' 'y=sorted(x)'
1000 loops, best of 3: 714 usec per loop
App Engine's CPUs are substantially faster than the Air's, so 700 microseconds to sort 1000 results is a very pessimistic estimate; that compares with tens of msec just for fetching the data -- so, don't worry at all about sorting: as long as you can fetch the results you want, you'll be just fine.
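As a minimal sketch of that pattern -- assuming a hypothetical Score kind with score and created properties, and using the classic google.appengine.ext.db API -- fetching up to the 1000-result cap and then sorting in memory might look like:

import datetime
from google.appengine.ext import db

# entries from the last 24 hours (hypothetical Score kind with
# `score` and `created` properties)
cutoff = datetime.datetime.utcnow() - datetime.timedelta(days=1)
entries = db.GqlQuery(
    "SELECT * FROM Score WHERE created >= :1", cutoff).fetch(1000)

# sorting in memory costs microseconds, vs tens of msec for the fetch itself
top100 = sorted(entries, key=lambda e: e.score, reverse=True)[:100]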
BTW, to assess likely App Engine performance on a task, see Guido van Rossum's presentation here -- he claims 10-50 msec for a typical db.get() (50-100 msec for a put, etc).
If you expect more than 1000 results from a query, that typically means you need to denormalize the table to trim the query's results to under 1000. For example, in your case, say that you expect about 500-700 entries a day -- in which case getting all results for today would be no problem, but a week is definitely a problem: you need to trim the query down to 20% or less of what it would "normally" be.
For example, say your scores are in the range 0-100, about uniformly distributed. In this case, you might add to the entity a boolean field "topcandidate": when you save the entity, you set that field to True iff the score is in the range 85-100 (as you see, this means the table gets denormalized, since the field represents logically redundant information).
When you fetch the weekly top results, you add an equality filter requiring topcandidate to be True. Instead of 3500-4900 results, this should get you down to 500-900 -- the top-scoring 15% or so -- after which you can sort them in memory and pick, say, the top 100.
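Here is a rough sketch of that approach, again with the classic google.appengine.ext.db API; the Score model, its property names, the weekly_top helper, and the 85 threshold (which assumes the uniform 0-100 distribution above) are all hypothetical:

from google.appengine.ext import db

class Score(db.Model):
    player = db.StringProperty()
    score = db.IntegerProperty()
    created = db.DateTimeProperty(auto_now_add=True)
    # denormalized: logically redundant with `score`, but it lets the
    # weekly query use an equality filter to stay well under 1000 results
    topcandidate = db.BooleanProperty(default=False)

    def put(self, **kwargs):
        # keep the redundant flag in sync on every save
        self.topcandidate = self.score is not None and self.score >= 85
        return super(Score, self).put(**kwargs)

def weekly_top(week_start, n=100):
    # equality filter on the denormalized flag plus the date range:
    # expect roughly 500-900 candidates instead of 3500-4900
    q = Score.all().filter('topcandidate =', True).filter('created >=', week_start)
    candidates = q.fetch(1000)
    # final ranking happens in memory, which (as shown above) is cheap
    return sorted(candidates, key=lambda s: s.score, reverse=True)[:n]

(Combining the equality filter with the inequality on created will likely require a composite index entry in index.yaml, but that's a one-time configuration detail.)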
Of course, the exact numbers depend on the distribution of your score field (more likely a bell curve than a flat uniform distribution) and on how many "top score candidates" you need, but this is one generally useful approach to work around the 1000-results limit.