Each of my users polls the server every few seconds. I need to keep a list of the users who have polled in the last 30 seconds available to a task I have queued to run every few seconds.

The obvious way I see to do it is to update a datastore entry every time a user polls, and then, from the queued task, query for the entries with a timestamp within the last N seconds. I can't imagine this scaling well.

Any recommendations?

Thanks.

+2  A: 

I've benchmarked writes for small entities at around 1/20 of a second each. If you base the key of your poll entities on a unique value associated with each user, then you can access the poll entities by key when you update them, which is a fast lookup by key rather than a query.

The writes should be fine as long as you don't need to write to the same entity simultaneously (which it sounds like you won't be doing, since you have one per user). Just make sure that you don't put the poll entities in the same entity group; if you do, writing to one will block writes to all of the others during the write.
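
A minimal sketch of that write path on the Python runtime (the Poll kind, the user-id key name, and the property name are assumptions here, not anything specified above):

    from google.appengine.ext import db

    class Poll(db.Model):
        # auto_now refreshes the timestamp on every put()
        last_seen = db.DateTimeProperty(auto_now=True)

    def record_poll(user_id):
        # Key name derived from the user id, so updates are a fast lookup by key.
        # No parent is set, so each Poll is the root of its own entity group and
        # writes for different users never contend with each other.
        Poll(key_name=user_id).put()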

You can query for all poll entities that were updated in the last 30 seconds; this is a read and should be fast.
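
For the read side, a rough sketch of that 30-second query, assuming the Poll model above:

    import datetime

    def recent_pollers(window_seconds=30):
        cutoff = datetime.datetime.utcnow() - datetime.timedelta(seconds=window_seconds)
        # Single inequality filter on the indexed timestamp property
        polls = Poll.all().filter('last_seen >=', cutoff).fetch(1000)
        # The key name is the user id, so no extra property is needed
        return [p.key().name() for p in polls]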

DutrowLLC
The thing I am worried about is the final query to create my list of current users. If I have 1000 users polling, this means I have to fetch 1000 entities. Or are you saying that the query will be fast if I assign a key to each user? Thanks.
synapz
I think that query will actually be slow. You'll probably even have to page it into several different queries. Does it need to return immediately? If you run the task every few seconds, I wonder if you can just query for the users who've polled since the last time you ran it? Another option would be to use perhaps 50 entities, each holding a list of recently polled users. You could use the user's key to form a hash that would determine which entity to update when they poll.
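A rough sketch of that sharded-list idea (the shard count, the PollShard kind, and the crc32-based hash are all assumptions, and pruning of stale entries is left out):

    import zlib
    from google.appengine.ext import db

    NUM_SHARDS = 50

    class PollShard(db.Model):
        user_ids = db.StringListProperty()

    def record_poll_sharded(user_id):
        # user_id is assumed to be a plain byte string here
        key_name = 'shard-%d' % (zlib.crc32(user_id) % NUM_SHARDS)

        def txn():
            shard = PollShard.get_by_key_name(key_name) or PollShard(key_name=key_name)
            if user_id not in shard.user_ids:
                shard.user_ids.append(user_id)
            shard.put()

        # Transaction guards against lost updates when two users hit the same shard
        db.run_in_transaction(txn)

The task that runs every few seconds could then read back all 50 shards with a single small fetch and reset the lists, rather than querying one entity per user.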
DutrowLLC
A: 

Having one datastore entity per user and updating it with the current time each time they poll should work just fine - though it will, of course, add a little latency to every user request.

Nick Johnson