Hello!

I have recently become interested in how the Microsoft Visual C++ compiler implements and optimizes static variables. I have the following code:

void no_static_initialization() {
   static int value = 3;
}

void static_initialization(int new_value) {
   static int value = new_value;
}

#include <cstdlib>

int main() {

   no_static_initialization();

   static_initialization(1);

   static_initialization(rand());

   return 0;
}

My main interest was, of course, the last case.

What I wanted to know is this: does the compiler have to check explicitly, on every call, whether the function is being called for the first time (in which case the static variable is initialized) or not (in which case nothing happens)?

I have the assembly listing and would be grateful if someone could help me understand how this is implemented (compiled with optimizations).

[Image: assembly listing]

Here, the first call was optimized away entirely, and the two calls to the second function were inlined; they compile to similar chunks of code.

Each of them performs a test and then makes a short jump if the test was not successful (these jumps evidently point to the end of the corresponding routine).

Could someone give a better explanation of what is happening here? Does the compiler actually keep a flag that indicates whether this is the first time the function has been called? Where is it stored? (I guess all that test stuff is about it, but I'm not exactly sure.)

Thank you.

+6  A: 

Yes, the compiler has to add a hidden flag that records whether the static variable has already been initialized, and it checks that flag on every call. Both snippets test the flag: if it is set, the code jumps to the end of the function; otherwise it initializes the static variable. Note that since the compiler has inlined both calls, it could just as well have optimized away the second test, because the flag is known to be set once the first call has run.

The flag seems to be located at address 0x00403374, and takes a byte, while the variable itself is located at address 0x00403370.
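As a rough illustration, the flag-plus-variable scheme can be written out by hand. This is only a sketch: the names `value_storage`, `value_initialized` and `static_initialization_emulated` are made up, the function returns the stored value purely so the effect can be observed, and the code ignores thread safety and exceptions, which a real compiler may have to handle.

```cpp
#include <cstdint>

// Hypothetical names: the real compiler-generated symbols are not visible in source.
static int          value_storage = 0;      // plays the role of the variable (0x00403370 in the listing)
static std::uint8_t value_initialized = 0;  // plays the role of the one-byte flag (0x00403374)

// Hand-written equivalent of:
//   void static_initialization(int new_value) { static int value = new_value; }
int static_initialization_emulated(int new_value) {
    if ((value_initialized & 1) == 0) {  // corresponds to the "test" + conditional jump
        value_initialized |= 1;          // remember that initialization has happened
        value_storage = new_value;       // run the initializer exactly once
    }
    return value_storage;                // later calls leave the value untouched
}
```

Calling `static_initialization_emulated(1)` and then `static_initialization_emulated(42)` returns 1 both times, which mirrors what the two inlined snippets do.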

David Rodríguez - dribeas
So far as I can see, the compiler can perfectly well optimize out any "first run" checks for a static local initialized with a literal, and use the same mechanism it uses for global static initialization. This would make any flags etc. unnecessary. You wouldn't be able to tell the difference by standard-conforming means, so the "as if" rule applies.
Pavel Minaev
@Pavel Minaev: I have reread your comment a couple of times. My understanding of what you mean is: *A local static that gets initialized by a constant can be implicitly converted to a global static and avoid the extra flag/check*. If that is what you meant, in the general case you cannot. Imagine that a local static variable has a constructor that takes an int literal. Because the constructor may have unknown side effects, the compiler cannot move the execution of that constructor prior to main execution, or it could break the *as-if* rule, and potentially cause UB in a well defined program.
David Rodríguez - dribeas
Consider: `extern Logger logger; struct test { test( int ) { logger.log( "test created" ); }}; void foo() { static test t(5); } int main() { foo(); }` If `logger` is a global variable initialized in a different translation unit, the program (the part we see) is well defined. By the time `foo` is called, `logger` must already be initialized, so calling a method on it is valid. If the compiler decided to move `t` outside of `foo` and treat it as a global variable, the relative order of initialization of `foo::t` and `logger` would no longer be guaranteed, and the program could fall into Undefined Behavior land by using `logger` before it is constructed.
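A minimal single-file sketch of the timing involved, assuming a hypothetical `event_log()` helper in place of `logger` (it cannot reproduce the cross-translation-unit ordering problem itself, only show that the constructor's side effect happens at the first call to `foo` and not before):

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for `logger`: records events in the order they occur.
std::vector<std::string>& event_log() {
    static std::vector<std::string> events;  // itself a lazily initialized local static
    return events;
}

struct test {
    explicit test(int) { event_log().push_back("test created"); }  // observable side effect
};

void foo() {
    static test t(5);  // constructed when control first passes through this line
}

// Hypothetical driver: runs the scenario and returns the recorded events.
std::vector<std::string> run_scenario() {
    event_log().clear();
    event_log().push_back("before foo");
    foo();  // first call: constructor runs here
    foo();  // second call: t already exists, nothing is logged
    event_log().push_back("after foo");
    return event_log();
}
```

On its first run, `run_scenario()` records "test created" between "before foo" and "after foo", and only once. Hoisting `t` to a global would move that side effect ahead of everything in `main`.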
David Rodríguez - dribeas
As always, in some cases the compiler *could* actually do it if it can guarantee that the initialization does not have any side effects (as with the arithmetic types), or if it is a user-defined type with a trivial constructor, or possibly in other situations. In all but the most basic examples, the analyzer would have to work quite hard to prove that it can optimize away one byte and a test.
David Rodríguez - dribeas
@David: I guess I wasn't clear enough - I was specifically referring to the case such as the one in the question (where, after all inlining and constant-folding, the end result is identical to static initialization). Naturally, if there are any _observable_ side effects, the situation is different.
Pavel Minaev
+1  A: 

I like to use LLVM because the code it generates tells you a bit more explicitly what it's doing.

The actual code is below, because it's kind of a long read. Yes, LLVM creates guard variables for static values. Notice how static_initialization/bb acquires the guard, checks whether the result indicates that the variable is already initialized, and branches either to bb1 if it needs to initialize or to bb2 if it doesn't. This isn't the only possible way to meet the single-initialization requirement, but it's the usual one.

; ModuleID = '/tmp/webcompile/_31867_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"

@guard variable for static_initialization(int)::value = internal global i64 0 ; <i64*> [#uses=3]
@static_initialization(int)::value = internal global i32 0 ; <i32*> [#uses=1]

define void @no_static_initialization()() nounwind {
entry:
  br label %return

return:                                           ; preds = %entry
  ret void
}

define void @static_initialization(int)(i32 %new_value) nounwind {
entry:
  %new_value_addr = alloca i32                    ; <i32*> [#uses=2]
  %0 = alloca i8                                  ; <i8*> [#uses=2]
  %retval.1 = alloca i8                           ; <i8*> [#uses=2]
  %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
  store i32 %new_value, i32* %new_value_addr
  %1 = load i8* bitcast (i64* @guard variable for static_initialization(int)::value to i8*), align 1 ; <i8> [#uses=1]
  %2 = icmp eq i8 %1, 0                           ; <i1> [#uses=1]
  br i1 %2, label %bb, label %bb2

bb:                                               ; preds = %entry
  %3 = call i32 @__cxa_guard_acquire(i64* @guard variable for static_initialization(int)::value) nounwind ; <i32> [#uses=1]
  %4 = icmp ne i32 %3, 0                          ; <i1> [#uses=1]
  %5 = zext i1 %4 to i8                           ; <i8> [#uses=1]
  store i8 %5, i8* %retval.1, align 1
  %6 = load i8* %retval.1, align 1                ; <i8> [#uses=1]
  %toBool = icmp ne i8 %6, 0                      ; <i1> [#uses=1]
  br i1 %toBool, label %bb1, label %bb2

bb1:                                              ; preds = %bb
  store i8 0, i8* %0, align 1
  %7 = load i32* %new_value_addr, align 4         ; <i32> [#uses=1]
  store i32 %7, i32* @static_initialization(int)::value, align 4
  store i8 1, i8* %0, align 1
  call void @__cxa_guard_release(i64* @guard variable for static_initialization(int)::value) nounwind
  br label %bb2

bb2:                                              ; preds = %bb1, %bb, %entry
  br label %return

return:                                           ; preds = %bb2
  ret void
}

declare i32 @__cxa_guard_acquire(i64*) nounwind

declare void @__cxa_guard_release(i64*) nounwind

define i32 @main() nounwind {
entry:
  %retval = alloca i32                            ; <i32*> [#uses=2]
  %0 = alloca i32                                 ; <i32*> [#uses=2]
  %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
  call void @no_static_initialization()() nounwind
  call void @static_initialization(int)(i32 1) nounwind
  %1 = call i32 @rand() nounwind                  ; <i32> [#uses=1]
  call void @static_initialization(int)(i32 %1) nounwind
  store i32 0, i32* %0, align 4
  %2 = load i32* %0, align 4                      ; <i32> [#uses=1]
  store i32 %2, i32* %retval, align 4
  br label %return

return:                                           ; preds = %entry
  %retval1 = load i32* %retval                    ; <i32> [#uses=1]
  ret i32 %retval1
}

declare i32 @rand() nounwind
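
For readers who find the IR hard to follow, the control flow of `static_initialization` above corresponds roughly to the C++ sketch below. The helpers `guard_acquire_sketch` and `guard_release_sketch` are simplified, single-threaded stand-ins written for this illustration; they are not the real `__cxa_guard_acquire` / `__cxa_guard_release`, which also deal with concurrent callers. The sketch tests the whole guard word where the IR tests only its first byte, and it returns the stored value so the effect can be observed.

```cpp
#include <cstdint>

static std::uint64_t guard_for_value = 0;  // like "guard variable for static_initialization(int)::value"
static int           guarded_value = 0;    // like "static_initialization(int)::value"

// Simplified stand-in: nonzero means "the caller must run the initializer".
static int guard_acquire_sketch(std::uint64_t* guard) {
    return *guard == 0 ? 1 : 0;
}

// Simplified stand-in: marks initialization as complete.
static void guard_release_sketch(std::uint64_t* guard) {
    *guard = 1;
}

int static_initialization_sketch(int new_value) {
    if (guard_for_value == 0) {                        // entry: fast-path check of the guard
        if (guard_acquire_sketch(&guard_for_value)) {  // bb: acquire and test the result
            guarded_value = new_value;                 // bb1: store the initial value
            guard_release_sketch(&guard_for_value);    // bb1: release the guard
        }
    }
    return guarded_value;                              // bb2 / return
}
```

As with the IR, only the first call stores a value; every later call takes the fast path past the guard check.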
TokenMacGuy