I am looking for an approach that will let me assign ordinal numbers 0..(N-1) to N O/S threads, such that the threads are in numeric order. That is, the thread that gets will have a lower O/S thread ID than the thread with ordinal 1.
In order to carry this out, the threads communicate via a shared memory space.
The memory ordering model is such that writes will be atomic (if two concurrent threads write a memory location at the same time, the result will be one or the other). The platform will not support atomic compare-and-set operations.
I am looking for an algorithm that is efficient in the number of writes to shared memory, and will complete rapidly with up to tens of thousands of threads, and with no nasty worse-case thread arrival conditions.
The O/S will assign thread numbers in arbitrary order throughout a 32-bit space. There may be arbitrary thread creation delays - the algorithm can be considered complete when all N threads are present.
I am unable to use the obvious solution of collecting all the threads, and then having one thread sort them - without an atomic operation, I have no way of safely collecting all the individual threads (another thread could rewrite the slot).