I'm curious about the underlying implementation of static variables within a function.

If I declare a static variable of a fundamental type (char, int, double, etc.), and give it an initial value, I imagine that the compiler simply sets the value of that variable at the very beginning of the program before main() is called:

void SomeFunction();

int main(int argCount, char ** argList)
{
    // at this point, the memory reserved for 'answer'
    // already contains the value of 42
    SomeFunction();
}

void SomeFunction()
{
    static int answer = 42;
}

However, if the static variable is an instance of a class:

class MyClass
{
    //...
};

void SomeFunction();

int main(int argCount, char ** argList)
{
    SomeFunction();
}

void SomeFunction()
{
    static MyClass myVar;
}

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

static bool initialized = false;
if (!initialized)
{
    // construct myVar
    initialized = true;
}
+4  A: 

In the compiler output I have seen, function local static variables are initialized exactly as you imagine.

Note that in general this is not done in a thread-safe manner (before C++11, the language does not require it). So if you have functions with static locals like this that might be called from multiple threads, you should take this into account. Calling the function once in the main thread before any other threads are started will usually do the trick.

I should add that if the initialization of the local static is by a simple constant like in your example, the compiler doesn't need to go through these gyrations - it can just initialize the variable in the image or before main() like a regular static initialization (because your program wouldn't be able to tell the difference). But if you initialize it with a function's return value, then the compiler pretty much has to test a flag indicating if the initialization has been done or something equivalent.
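To make the flag test concrete, here is a minimal sketch that puts a static local initialized from a function's return value next to a hand-written equivalent. The names ComputeAnswer, GetAnswer, GetAnswerExpanded, g_init_calls and DemoLazyInit are hypothetical, and the "expanded" version is only an approximation of what a compiler might emit, not the output of any particular compiler.

```cpp
#include <cassert>

// Hypothetical counter: how many times the initializer actually ran.
int g_init_calls = 0;

int ComputeAnswer()
{
    ++g_init_calls;
    return 42;
}

// What the programmer writes.
int GetAnswer()
{
    static int answer = ComputeAnswer(); // runs on the first call only
    return answer;
}

// Roughly what a non-thread-safe implementation might generate.
int GetAnswerExpanded()
{
    static bool initialized = false; // constant, so it can live in the image
    static int answer;               // storage reserved, value set later
    if (!initialized)
    {
        answer = ComputeAnswer();
        initialized = true;
    }
    return answer;
}

// Small self-check; call it from main().
void DemoLazyInit()
{
    assert(g_init_calls == 0);          // nothing has run before the first call
    assert(GetAnswer() == 42);
    assert(GetAnswer() == 42);
    assert(g_init_calls == 1);          // the initializer ran exactly once
    assert(GetAnswerExpanded() == 42);
    assert(GetAnswerExpanded() == 42);
    assert(g_init_calls == 2);          // once more for the hand-written version
}
```

Because ComputeAnswer has a visible side effect, the compiler cannot move the initialization to program startup here, which is the case where the flag (or something equivalent) is needed.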

Michael Burr
+1  A: 

You're right about everything, including the initialized flag as a common implementation. This is basically why initialization of static locals is not thread-safe, and why pthread_once exists.
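As an illustration of the "once" idea, here is a small sketch using std::call_once from C++11, which I am swapping in as the standard-library analogue of pthread_once so the example stays in portable C++. The names g_once, g_instance, g_init_runs, InitInstance, GetInstance and DemoCallOnce are hypothetical.

```cpp
#include <cassert>
#include <mutex>

class MyClass
{
public:
    MyClass() : value(42) {}
    int value;
};

std::once_flag g_once;          // plays the role of the "initialized" flag
MyClass* g_instance = nullptr;  // set by InitInstance
int g_init_runs = 0;            // hypothetical counter for the demo

void InitInstance()
{
    ++g_init_runs;
    g_instance = new MyClass(); // deliberately never deleted in this sketch
}

MyClass& GetInstance()
{
    // Runs InitInstance exactly once, even if several threads get here
    // at the same time; later callers wait until it has finished.
    std::call_once(g_once, InitInstance);
    return *g_instance;
}

// Small self-check; call it from main().
void DemoCallOnce()
{
    assert(GetInstance().value == 42);
    assert(GetInstance().value == 42);
    assert(g_init_runs == 1);
}
```

On some toolchains this needs to be built with thread support enabled (for example -pthread with GCC).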

One slight caveat: the compiler must emit code which "behaves as if" the static local variable is constructed the first time control passes through its declaration. Since integer initialization has no side effects (and calls no user code), it's up to the compiler when it initializes the int. User code cannot "legitimately" find out what it does.

Obviously you can look at the assembly code, or provoke undefined behaviour and make deductions from what actually happens. But the C++ standard doesn't count that as valid grounds to claim that the behaviour is not "as if" it did what the spec says.

Steve Jessop
+1  A: 

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

Yes, that's right. And, FWIW, it's not necessarily thread-safe (if the function is called "for the first time" by two threads simultaneously).

For that reason you might prefer to define the variable at global scope (although maybe in a class or namespace, or static without external linkage) instead of inside a function, so that it's initialized before the program starts without any run-time "if".
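A minimal sketch of that alternative, assuming a MyClass whose constructor just sets a flag so the effect can be observed. The names g_constructed, Value and DemoNamespaceScope are hypothetical.

```cpp
#include <cassert>

bool g_constructed = false; // constant-initialized, so it is set first

class MyClass
{
public:
    MyClass() { g_constructed = true; }
    int Value() const { return 42; }
};

// Defined outside any function: constructed during program startup,
// so SomeFunction needs no run-time "has it been initialized?" test.
static MyClass myVar;

int SomeFunction()
{
    return myVar.Value();
}

// Small self-check; call it from main().
void DemoNamespaceScope()
{
    assert(g_constructed);        // already true before SomeFunction is called
    assert(SomeFunction() == 42);
}
```

One trade-off to keep in mind: the relative order in which such variables are initialized across different source files is not specified, which is one reason function-local statics are used in the first place.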

ChrisW
+3  A: 

This question covered similar ground, but thread safety wasn't mentioned. For what it's worth, C++0x will make function static initialisation thread safe.

(see the C++0x FCD, 6.7/4 on function statics: "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.")
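Under that rule, a sketch like the following should construct the object exactly once even when several threads hit the declaration at the same moment. It assumes a C++11 or later compiler; the names Expensive, Instance, g_constructions and DemoConcurrentFirstCall are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> g_constructions{0}; // hypothetical counter for the demo

struct Expensive
{
    Expensive() { ++g_constructions; }
};

Expensive& Instance()
{
    // With the quoted rule, threads arriving here during initialization
    // wait until the first one has finished constructing the object.
    static Expensive instance;
    return instance;
}

// Small self-check; call it from main().
void DemoConcurrentFirstCall()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i)
    {
        threads.emplace_back([] { Instance(); });
    }
    for (std::thread& t : threads)
    {
        t.join();
    }
    assert(g_constructions.load() == 1); // constructed once, not eight times
}
```

On some toolchains this needs to be built with thread support enabled (for example -pthread with GCC).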

One other thing that hasn't been mentioned is that function statics are destructed in reverse order of their construction, so the compiler maintains a list of destructors to call on shutdown (this may or may not be the same list that atexit uses).
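A small sketch of that ordering, using a hypothetical Tracer type that prints from its constructor and destructor. The names g_next_order, TouchFirst, TouchSecond and DemoDestructionOrder are made up for the example.

```cpp
#include <cassert>
#include <cstdio>

int g_next_order = 0; // hypothetical counter: records construction order

struct Tracer
{
    const char* name;
    int order;

    explicit Tracer(const char* n) : name(n), order(++g_next_order)
    {
        std::printf("construct %s\n", name);
    }

    ~Tracer()
    {
        std::printf("destroy %s\n", name);
    }
};

int TouchFirst()
{
    static Tracer t("first");
    return t.order;
}

int TouchSecond()
{
    static Tracer t("second");
    return t.order;
}

// Small self-check; call it from main().
void DemoDestructionOrder()
{
    // The order of the first calls decides construction order,
    // not the order in which the functions appear in the source.
    assert(TouchSecond() == 1);
    assert(TouchFirst() == 2);
    assert(TouchSecond() == 1); // already constructed, nothing new happens
}
```

Because "second" is constructed before "first" here, at normal program exit "first" is destroyed before "second".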

James Hopkin
@James Hopkin - Can you give a reference/citation to its being thread safe in C++0x? I haven't found one.
ChrisW
+1  A: 

Another twist is in embedded code, where the run-before-main() code (cinit/whatever) may copy pre-initialized data (both statics and non-statics) into RAM from a const data segment, perhaps residing in ROM. This is useful where the code may not be running from some sort of backing store (disk) from which it can be reloaded. Again, this doesn't violate the requirements of the language, since this is done before main().

Slight tangent: While I've not seen it done much (outside of Emacs), a program or compiler could basically run your code in a process and instantiate/initialize objects, then freeze and dump the process. Emacs does something similar to this to load up large amounts of elisp (i.e. chew on it), then dump the running state as the working executable, to avoid the cost of parsing on each invocation.

jesup