views:

211

answers:

6

Will the following piece of code work as expected in a multi-threaded scenario?

int getUniqueID()  
{  
    static int ID=0;  
    return ++ID;  
}

It's not necessary that the IDs to be contiguous - even if it skips a value, it's fine. Can it be said that when this function returns, the value returned will be unique across all threads?

+5  A: 

No, there is still a potential for races, because the increment is not necessarily atomic. If you use an atomic operation to increment ID, this should work.

WhirlWind
+14  A: 

No, it won't. Your processor will need to do the following steps to execute this code:

  • Fetch value of ID from memory to a register
  • Increment the value in the register
  • Store the incremented value to memory

If a thread switch occurs during this (non atomic) sequence, the following can happen:

  • Thread a fetches the value 1 to a register
  • Thread a increments the value, so the register now contains 2
  • Context switch
  • Thread b fetches the value 1 (which is still in memory)
  • Context switch
  • Thread a stores 2 to memory and returns
  • Context switch
  • Thread b increments the value it has stored in its register to 2
  • Thread b (also) stores the value 2 to memory and returns 2

So, both threads return 2.

FRotthowe
+2  A: 

If you just need some monotonically increasing (or very close to it) numbers across N threads, consider this (k is some number such that 2^k > N):

int getUniqueIDBase()  
{  
    static int ID=0;  
    return ++ID;  
}

int getUniqueID()
{
    return getUniqueIDBase() << k + thread_id;
}
Yuliy
That's pretty innovative. +1
WhirlWind
+2  A: 

getUniqueID has some at least two race conditions. While initializing ID and when incrementing ID. I've rewritten the function to show the data races more clearly.

int getUniqueID()  
{
   static bool initialized = false;
   static int ID;
   if( !initialized )
   {
      sleep(1);
      initialized = true;

      sleep(1);
      ID = 1;      
   }

   sleep(1);
   int tmp = ID;

   sleep(1);
   tmp += 1;

   sleep(1);
   ID = tmp;

   sleep(1);
   return tmp;
}

Incrementing is deceptive, it looks so small as to assume it is atomic. However it is a load-modify-store operation. Load the value from memory to a CPU register. inc the register. Store the register back to memory.

Using the new c++0x you could just use the std::atomic type.

int getUniqueID()  
{  
    static std::atomic<int> ID{0};  
    return ++ID;  
}

NOTE: technically I lied. zero initialized globals (including function statics) can be stored in the bss memory and will not need to be initialized once the program starts. However, the increment is still an issue.

caspin
the initialization of `ID` is thread-safe as initialization of variables with `static` storage duration is done on program startup
Christoph
+2  A: 

++ is not necessarily atomic, so no, this is not thread-safe. However, a lot of C runtimes provide atomic versions, eg __sync_add_and_fetch() for gcc and InterlockedIncrement() on Windows.

Christoph
A: 

Note: The word almost is used because a global variable will be initialized at process start (i.e. its constructor will be called before entering main) whereas the static variable inside a function will be initialized the first time the statement is executed.

Your question is wrong from the start:

ID generator with local static variable - thread-safe?

In C/C++, a variable that is static inside a function or inside a class/struct declaration behaves (almost) like a global variable, not a local stack-based one.

The following code:

int getUniqueID()  
{  
    static int ID=0;  
    return ++ID;  
}

Would be (almost) similar to the pseudo-code :

private_to_the_next_function int ID = 0 ;

int getUniqueID()  
{  
    return ++ID;  
}

with the pseudo-keyword private_to_the_next_function making the variable invisible to all other functions but getUniqueId...

Here, static is only hiding the variable, making its access from other functions impossible...

But even hidden, the variable ID remains global: Should getUniqueId be called by multiple threads, ID will be as thread safe as other global variables, that is, not thread-safe at all.

Edit : Lifetime of variables

After reading comments, I felt I was not clear enough with my answer. I am not using global/local notions for their scope signification, but for their lifetime signification:

A global will live as long as the process is running, and a local, which is allocated on the stack, will start its life when entering the scope/function, and will cease to exist the moment the scope/function is exited. This means a global will retain its value, while a local will not. This also means a global will be shared between threads, while a local will not.

Add to it the static keyword, which has different meanings depending on the context (this is why using static on global variables and on functions in C++ is deprecated in favour of anonymous namespaces, but I disgress).

When qualifying a local variable, this local ceases to behave like a local. It becomes a global hidden inside a function. So it behaves as if the value of a local variable was magically remembered between the function calls, but there's not magic: The variable is a global one, and will remain "alive" until the end of the program.

You can "see" this by logging the creation and destruction of an object declared static inside a function. The construction will happen when the declaration statement will be executed, and the destruction will happen at the end of the process:

bool isObjectToBeConstructed = false ;
int iteration = 0 ;

struct MyObject
{
   MyObject() { std::cout << "*** MyObject::MyObject() ***" << std::endl ; }
   ~MyObject() { std::cout << "*** MyObject::~MyObject() ***" << std::endl ; }
};

void myFunction()
{
   std::cout << "   myFunction() : begin with iteration " << iteration << std::endl ;

   if(iteration < 3)
   {
      ++iteration ;
      myFunction() ;
      --iteration ;
   }
   else if(isObjectToBeConstructed)
   {
      static MyObject myObject ;
   }

   std::cout << "   myFunction() : end with iteration " << iteration << std::endl ;
}


int main(int argc, char* argv[])
{
   if(argc > 1)
   {
      std::cout << "main() : begin WITH static object construction." << std::endl ;
      isObjectToBeConstructed = true ;
   }
   else
   {
      std::cout << "main() : begin WITHOUT static object construction." << std::endl ;
      isObjectToBeConstructed = false ;
   }

   myFunction() ;

   std::cout << "main() : end." << std::endl ;
   return 0 ;
}

If you launch the executable without parameters, the execution will never go through the static object declaration, and so, it will never be constructed nor destructed, as shown by the logs:

main() : begin WITHOUT static object construction.
   myFunction() : begin with iteration 0
   myFunction() : begin with iteration 1
   myFunction() : begin with iteration 2
   myFunction() : begin with iteration 3
   myFunction() : end with iteration 3
   myFunction() : end with iteration 2
   myFunction() : end with iteration 1
   myFunction() : end with iteration 0
main() : end.

But if you launch it with a parameter, then the object will be constructed at the third recursive call of myFunction, and destroyed only at the end of the process, as seen by the logs:

main() : begin WITH static object construction.
   myFunction() : begin with iteration 0
   myFunction() : begin with iteration 1
   myFunction() : begin with iteration 2
   myFunction() : begin with iteration 3
*** MyObject::MyObject() ***
   myFunction() : end with iteration 3
   myFunction() : end with iteration 2
   myFunction() : end with iteration 1
   myFunction() : end with iteration 0
main() : end.
*** MyObject::~MyObject() ***

Now, if you play with the same code, but calling myFunction through multiple threads, you'll have race conditions on the constructor of myObject. And if you call this myObject methods or use this myObject variables in myFunction called by multiple threads, you'll have race conditions, too.

Thus, the static local variable myObject is simply a global object hidden inside a function.

paercebal
Incorrect. A static local variable has local scope, but its value is retained across function calls. Global and Local refer to scope, not storage class.
WhirlWind
Static local variables can be _returned_ , so long as the caller does not modify them. They are still scoped in the function that allocated them.
Tim Post
@WhirlWind: I corrected my answer to clarify that I use the "global" and "local" words to specify the lifetime of a variable, which is what really matters in a multithread context.
paercebal
@Tim Post: Static local variables can be returned. In fact you could even return a reference/pointer to them and let the caller modify them. In a multithread context, this can have amusing results. The fact they are apparently "scoped" won't change the fact their lifetime is not the lifetime of a normal local variable, but the lifetime of a global one.
paercebal