ansaurus

Question

gettimeofday clock_gettime solution to generate unique number

Answer 1

A:

Using a time stamp as a unique ID will never work reliably unless you limit yourself to only one transaction per lowest clock tick (1 millisecond in this case).

Since you are stuck using a time value for the first 6 of 9 bytes you need to try to fit as much range into the last 3 bytes as possible.

If you can get away with not using ASCII characters in the last 3 bytes then you should avoid it since that will limit the values it can have a great deal. If possible you should try to use these bytes as a 24 bit integer (range of 16777216) and just have each transaction increment the counter. You could then set it back to 0 each time that gettimeofday let you know that the time had changed. (or you could set up an repeating SIGALRM to let you know when to call gettimeofday again to update your time and 0 the 24 bit integer).

If you are forced to use ASCII printable characters for these bytes then things are a little bit more difficult. The easiest way to extend the range of this would be to use hexadecimal rather than decimal numbers. This grows your representable range from 1000 to 4096. You can do better if you use an even broader number base, though. If you tacked on the first 22 characters of the alphabet (the same way that tacking on the first 6 letters is done for hex) then you can represent 32x32x32 values, which is 32768. That would be a lot of transactions per second. You can do even better if you extend your numeric alphabet even further, but it will become more piecemeal as you do since you will probably want to restrict some characters from being appearing in the value. Using a representation that strtol or strtoul can easily work with will likely be easier to program.

If your application is multithreaded then you may want to consider taking up part of your numeric range as a thread ID and let each thread keep its own transaction counter. This will make determining the relative time between two transactions processed by different threads more difficult to calculate, but it will keep the threads from all wanting to increment the same memory location (which may require a mutex or semaphore).

nategoose 2010-10-13 19:51:21

It won't even work if you limit yourself to one transaction per clock tick, because there's no guarantee that the clock doesn't occasionally run a little bit slow (e.g. when being slewed by ntpd, if it got a little ahead) and hence stay apparently "the same" for longer than that.

MarkR 2010-10-13 19:56:54

If the clock is running slow then it doesn't tick, does it? I was speaking with a monotonic clock in mind. One with no actual period -- just an ever increasing value. If the value moves backwards or held in the same state for a _long_ time (ntpd adjustment) then you're pretty much screwed if you are using this clock as any part of your timestamp.

nategoose 2010-10-13 20:10:21

Although the explanation doesn't give a direct solution due to my legacy limitations and limited problem description on my part, it opened up many avenues that I am yet to explore. Thank you.

Kiran 2010-10-13 21:11:03

Answer 2

A:

Generally using clock time on a heavy loaded system like this with a resolution under a second is a bad idea anyhow. Threads will take their timestamp and then be descheduled in the middle of the operation, so you will see things arriving out of order.

Three characters left to encode things uniquely is not much. Try at least to use some different encoding such as base64.

If you use gcc as a compiler you have thread local storage (TLS) as an extension that is quite efficient. Just prefix your static variable with __thread (or so). If you are restricted to phtreads, there are means to get thread specific keys, too, pthread_get_key. But better would be to have the information as long as possible on the stack of the thread.

To obtain a per thread counter that makes a serial number for your request use

your hhmmss timestamp as so far
as much bits that you need to identify your threads
the last bits for the per thread serial number as above that should only wrap round after more than a second

You could even be cheating and yield a thread that fires too many requests within the same second.

Jens Gustedt 2010-10-13 19:51:29

Answer 3

A:

I guess you could give each thread of each process a unique ID at startup, I guess this would take only one of the 3 available characters unless you have hundreds of threads. You can then use a local counter per-thread to set the last two characters (using base64 or even more, depending on what characters are allowed, to get enough amplitude).

In this situation the only case where a collision may happen is if the counter of a thread wraps during the same second.

Of course, this is a dirty hack. The Right Way would be to share a ressource amongst the threads/processes. It might be the simplest solution in your case tho.

simias 2010-10-13 19:57:29

Answer 4

+1 A:

The type of problem you are describing has already been more-or-less solved by issuing a UUID. This is a system that is designed to solve all the problems you mention and some more.

A linux library: http://linux.die.net/man/3/uuid

More information is available here: http://en.wikipedia.org/wiki/Universally_unique_identifier

teambob 2010-10-13 23:02:21

Thanks, this would be helpful in the future for me.

Kiran 2010-10-14 13:02:16

ansaurus

tags:

views:

answers:

gettimeofday clock_gettime solution to generate unique number

related questions