views:

106

answers:

3

I'm not even sure the following is possible, but it never hurts to ask:

I have two nodes running the same application. Each machine needs a sequence generator that will give it a number between 0 and 1e6. If a node has used a number, the other node must not use it. The generator should reset every night at midnight. No number should be used twice in the same day, even if a machine restarts. We'd like to avoid any solution involving databases, distributed caches or filesystems. Let's assume we will never need more than 1e6 numbers per day. The numbers do not have to be used in sequence.

So far we have thought of the following:

1) Machine A uses odd numbers, machine B uses even numbers.

Pros: no shared state.

Cons: a machine might run out of numbers when there are plenty left. If a machine restarts, it will reuse previously used numbers.

2) Machine A countr from 0 to 1e6, machine B from 1e6 to 0.

Pros: no shared state. Garantees that all available numbers will be consumed before running into problems.

Cons: doesn't scale to more than two machines. Same problem when a machine restarts.

What do you think? Is there a magic algorithm that will fulfill our requirements without needing to write anything to disk?

+2  A: 

Why not just have a small serivce that hands out IDs upon request? This scales to more than one machine, and doesn't require a change to the client if you need to change the ID allocation algorithm. This is rather simple to implement and quite easy to maintain going forward.

Jason
+2  A: 

No number should be used twice in the same day, even if a machine restarts.

Since you don't want to use any persistent state, this suggests to me that the number must depend on the time somehow. That is the only way in which the algorithm can tell two distinct startups apart. Can't you just use a combination (node, timestamp) for sufficiently fine timestamps, instead of your numbers?

Thomas
A: 

I really think the best way would be to have some machine that hands out numbers on requests (maybe even number ranges if you want to avoid too many queries) that wrote things out to disk.

If you're really against it, you could be really clever with method 1 if you can gaurantee the rate at which numbers are consumed. For example the machine could use the current time to determine where in its range to begin. I.E. if it's noon, begin at the middle of my range. This could be tweaked if you can put an upper limit on the amount of numbers generated per second (or generic time interval). This still has the problem of wasted tags and is pretty convoluted just to avoid writing a single number to disk.

Falaina