I'm working on a system conceptually similar to a URL shortener like bit.ly -- a client sends in a bit of data and the system returns a very short URL to represent it (a base URL plus a token representing the data). Ideally, it would start with the shortest tokens possible (maybe 4 or 5 alphanumeric characters) and move on to longer tokens only once the short ones are exhausted. It's not crucial that tokens be unguessable (if it were, I'd use longer tokens), but I'd also like it to not be trivial to enumerate them all, so I'd like to use only some fraction of the possible values. Ideally they'd be used in a random order (within each length), but I wouldn't say that's crucial.
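
To make "a fraction of the possible values, in random order within each length" concrete, here's a rough Python sketch of pre-generating such a pool. The function name and the 1% fraction are just placeholders, not a design decision:

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits  # 62 alphanumeric characters

def generate_token_pool(length, fraction, seed=None):
    """Pre-generate a random subset of the 62**length tokens of a given length."""
    rng = random.Random(seed)
    target = int((len(ALPHABET) ** length) * fraction)
    pool = set()
    while len(pool) < target:
        pool.add(''.join(rng.choice(ALPHABET) for _ in range(length)))
    tokens = list(pool)
    rng.shuffle(tokens)  # hand them out in random order within this length
    return tokens

# e.g. 1% of the ~14.8 million 4-character tokens (~148k values)
four_char_tokens = generate_token_pool(4, 0.01)
```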

I'd like to use Amazon Web Services for as much of this as I can. I'm struggling a bit with the best way to generate the tokens in a scalable manner. By scalable, I mean that many requests could be in flight at once without danger of race conditions and without each one having to do some complicated check to see whether the token it has chosen is available (e.g. the naive approach of randomly choosing a value, checking whether it's used, and starting over if it is -- sketched below).
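
By the naive approach I mean something like the following, where `is_token_used` and `mark_token_used` stand in for whatever storage lookup and write you'd actually use:

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits

def naive_allocate(is_token_used, mark_token_used, length=4):
    """Pick a random token, check it, retry on collision.

    The problem: between is_token_used() and mark_token_used() another
    request can claim the same token, so the check and the claim would
    have to be made atomic somehow, and retries pile up as the space fills.
    """
    while True:
        candidate = ''.join(random.choice(ALPHABET) for _ in range(length))
        if not is_token_used(candidate):
            mark_token_used(candidate)
            return candidate
```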

It seems like one obvious approach is something like an Oracle sequence, where each request could atomically grab the next value from a pre-allocated set and use it without concern. However, I don't see a great way to do that within AWS. I could potentially use SQS, but SQS only lets queued values live for four days, so I'd have to keep re-populating the queue as messages expire -- I'd like to generate a huge sequence once and forget about it for a long time. I could probably use RDS, but that seems like overkill. I don't really want to use something UUID-esque (i.e. avoiding collisions by using a massive value space), because that defeats the purpose -- I need very short tokens.
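
If I did go the SQS route, I imagine the consumption side would look roughly like this (boto3, queue URL is a placeholder, and setting aside SQS's at-least-once delivery caveat); the retention limit is what bothers me, since populate() couldn't just be run once:

```python
import boto3

sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/token-pool'  # placeholder

def populate(tokens):
    """Load pre-generated tokens into the queue -- ideally a one-time job,
    but message retention means this has to be re-run as tokens expire."""
    for token in tokens:
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=token)

def next_token():
    """Grab the next available token from the queue."""
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1)
    messages = resp.get('Messages', [])
    if not messages:
        raise RuntimeError('token pool is empty')
    msg = messages[0]
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg['ReceiptHandle'])
    return msg['Body']
```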

Other ideas?