views:

194

answers:

3

I have to label something in a "strong monotone increasing" fashion. Be it Invoice Numbers, sipping label numbers or the like.

  1. A number MUST NOT BE used twice
  2. Every number SHOULD BE used when exactly all smaller numbers have been used (no holes).

Fancy way of saying: I need to count 1,2,3,4 ... The number Space I have available are typically 100.000 numbers and I need perhaps 1000 a day.

I know this is a hard Problem in distributed systems and often we are much better of with GUIDs. But in this case for legal reasons I need "traditional numbering".

Can this be implemented on Google AppEngine (preferably in Python)?

+2  A: 

Take a look at how the sharded counters are made. It may help you. Also do you really need them to be numeric. If unique is satisfying just use the entity keys.

Ilian Iliev
No, it has to be numeric. I'm given the number space by somebody else (think of UPS or FedEx)
mdorseif
+9  A: 

If you absolutely have to have sequentially increasing numbers with no gaps, you'll need to use a single entity, which you update in a transaction to 'consume' each new number. You'll be limited, in practice, to about 1-5 numbers generated per second - which sounds like it'll be fine for your requirements.

Nick Johnson
Thanks for the Answer! If I weaken the no-Gaps (2.) requirement what are the alternatives? Unfortunately my number needs come in bursts of 33 a time (a truckload with 33 Pallets http://blogs.23.nu/disLEXia/2007/12/antville-16865/#).
mdorseif
Come to think of it: It's o problem to consume 33 numbers at once in one transaction. Still: I wonder what the alternatives for the 1 Requirement are.
mdorseif
If they don't have to be sequential, you can simply use the datastore's built in ID generation functionality. See allocate_ids on this page for details: http://code.google.com/appengine/docs/python/datastore/functions.html
Nick Johnson
+1  A: 

If you drop the requirement that IDs must be strictly sequential, you can use a hierarchical allocation scheme. The basic idea/limitation is that transactions must not affect multiple storage groups.

For example, assuming you have the notion of "users", you can allocate a storage group for each user (creating some global object per user). Each user has a list of reserved IDs. When allocating an ID for a user, pick a reserved one (in a transaction). If no IDs are left, make a new transaction allocating 100 IDs (say) from the global pool, then make a new transaction to add them to the user and simultaneously withdraw one. Assuming each user interacts with the application only sequentially, there will be no concurrency on the user objects.

Martin v. Löwis