views:

57

answers:

3

I have a User model for a game system, and I need to increase each user's points by 100 every hour.

from google.appengine.ext import db

# the key_name is the user id in this case
class User(db.Model):
  points = db.IntegerProperty(default=0)

So should I prepare a handler which does a GQL query across all entities? (Wouldn't that be a little slow with 500k - 1 million user entities?)

eg:

users = User.all()  # if I'm not mistaken, only 1000 results can be fetched per query
for user in users:
  user.points += 100
  db.put(user)

I suppose that by using task queues and sharded counters to overcome the 1000 limit, I could pull it off.

But then again, why don't I just take the time difference from when the user last logged in, and if it's N hours, award the user N * 100 points? That should reduce the load on my application.

eg:

class User(db.Model):
  lastlogin = db.DateTimeProperty()
  points = db.IntegerProperty(default=0)
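
Roughly, I'm picturing the login handler doing something like this (just a rough sketch, the function name and the transaction wrapping are made up for illustration):

from datetime import datetime
from google.appengine.ext import db

def award_points_on_login(userid):
  def txn():
    user = User.get_by_key_name(userid)
    now = datetime.utcnow()
    if user.lastlogin:
      # whole hours elapsed since the last login
      delta = now - user.lastlogin
      hours = delta.days * 24 + delta.seconds // 3600
      user.points += hours * 100
    user.lastlogin = now
    user.put()
  # run in a transaction so concurrent requests don't double-award
  db.run_in_transaction(txn)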

what do you guys think?

+4  A: 

But then again, why don't I just take the time difference from when the user last logged in, and if it's N hours, award the user N * 100 points? That should reduce the load on my application.

Yes, that is a much more efficient approach. That way, you only update the points once per user login, instead of updating every user record every hour, which would be very expensive.

gavinb
+1  A: 

Two thoughts on this:

  • Don't worry about 500K - 1M user entities. I don't know you or your game, but I'd be very surprised if you get more than 1K.

  • If there's an algorithmic way to allocate the points once rather than every hour, that will be MUCH preferable, so definitely do that. One question arises: are these point increments also accrued while the user is online? If so, you need to build a check into every action (see the sketch below). On the other hand, if you're doing that anyway, then you don't need a separate check at login time.
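
For instance, something like this at the top of every points-affecting handler (a rough sketch only; the last_award property and helper name aren't in your model, you'd have to add them):

from datetime import datetime, timedelta

def accrue_points(user):
  # Credit any whole hours accrued since the last award.
  # Advancing last_award by whole hours keeps the partial hour for next time.
  now = datetime.utcnow()
  if user.last_award:
    delta = now - user.last_award
    hours = delta.days * 24 + delta.seconds // 3600
    if hours:
      user.points += hours * 100
      user.last_award += timedelta(hours=hours)
      user.put()
  else:
    user.last_award = now
    user.put()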

Carl Smotricz
+1  A: 

Paging through large datasets discusses techniques for doing things like this - it's written in the context of displaying X items per page on a form, but the concepts are the same.

You can further split up the work by putting the actual updates in a deferred task.
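
Something along these lines might work (a sketch only; the batch size, function name, and use of query cursors are my own choices here):

from google.appengine.ext import db
from google.appengine.ext import deferred

def add_points_to_all(cursor=None, batch_size=100):
  query = User.all()
  if cursor:
    query.with_cursor(cursor)
  users = query.fetch(batch_size)
  if not users:
    return  # nothing left to process
  for user in users:
    user.points += 100
  db.put(users)  # one batch write instead of a put() per entity
  # re-queue ourselves to handle the next batch
  deferred.defer(add_points_to_all, query.cursor(), batch_size)

An hourly cron job would then only need to call deferred.defer(add_points_to_all) once to start the chain.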

However, as you've suggested, it's probably more efficient to only calculate this on demand.

James Polley