I read today about sharded counters in Google App Engine. The article says that you should expect to max out at about 5/updates per second per entity in the data store. But it seems to me that this solution doesn't 'scale' unless you have some way of knowing how many updates you are doing per second. For example, you can allocate 10 shards, but will then start choking at 50 updates per second.
So how do you know how fast the updates are coming, and how do you feed that number back into the number of shards?
My guess is that along with the counter you could keep some record of recent activity, and if you detect a spike you can increase the number of shards. Is that generally how it's done? And if so, why isn't it done in the sample code? (That last question may be unanswerable.) Is it more common practice to monitor website activity and update shard counts as traffic rises, as opposed to doing it automatically in the code?
Update: What are the practical consequences effects of having too few shards and choking? Does it simply mean that the website becomes unresponsive, or is it possible to lose counter updates because of timeouts?
As an aside, this question talks about implementing counters without sharding, but one of the answers impies that even memcache needs to be sharded if traffic is high. So this issue of shard allocation and tuning seems to be important.