In the general case it's difficult for a compiler to know exactly which objects a function might have access to and therefore could potentially modify. At the point where putchar() is called, GCC doesn't know if there might be a putchar() implementation that might be able to modify running, so it has to be somewhat pessimistic and assume that running might in fact have been changed.
For example, there might be a putchar() implementation later in the translation unit:
    int putchar(int c)
    {
        running = c;
        return c;
    }
Even if there's not a putchar() implementation in the translation unit, there could be something that might, for example, pass the address of the running object such that putchar() might be able to modify it:
    void foo(void)
    {
        set_putchar_status_location(&running);
    }
Note that your handler() function is globally accessible, so putchar() might call handler() itself (directly or otherwise), which is an instance of the above situation.
On the other hand, since running is visible only within the translation unit (being static), by the time the compiler gets to the end of the file it should be able to determine that there is no opportunity for putchar() to access it (assuming that's the case), and the compiler could go back and 'fix up' the pessimization in the while loop.
Since running is static, the compiler might be able to determine that it's not accessible from outside the translation unit and make the optimization you're talking about. However, since it's accessible through handler() and handler() is accessible externally, the compiler can't optimize the access away. Even if you make handler() static, it's accessible externally since you pass the address of it to another function.
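To make that last point concrete, here is a minimal sketch of how a static function can still be reached from outside once its address has been handed out. The names register_callback(), opaque_call() and spin() are hypothetical stand-ins invented for this illustration; in a real program the first two would live in another translation unit (playing the roles of signal() and putchar()), so the compiler could not see their bodies.

```c
#include <stddef.h>

static int running = 1;   /* internal linkage: not visible by name elsewhere */

/* Also internal linkage, yet its address is handed out in spin() below. */
static void handler(int sig)
{
    (void)sig;
    running = 0;
}

/* Hypothetical stand-ins for code that would normally be in another
   translation unit, where the compiler cannot inspect it. */
static void (*saved_callback)(int) = NULL;

void register_callback(void (*cb)(int))
{
    saved_callback = cb;
}

void opaque_call(void)
{
    if (saved_callback != NULL)
        saved_callback(0);   /* reaches handler(), which modifies running */
}

/* Loops until running is cleared or limit is reached; returns the
   number of iterations performed. */
int spin(int limit)
{
    int n = 0;

    running = 1;
    register_callback(handler);   /* handler's address escapes here */
    while (running && n < limit) {
        opaque_call();            /* may change running behind our back */
        ++n;
    }
    return n;
}
```

Because opaque_call() ends up invoking handler() through the saved pointer, running has to be reloaded after every call; a compiler that cached it in a register across the call would loop until limit instead of stopping after the first iteration.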
Note that in your first example, even though what I mentioned in the above paragraph is still true, the compiler can optimize away the access to running because the 'abstract machine model' the C language is based on doesn't take asynchronous activity into account except in very limited circumstances (one of which is the volatile keyword and another is signal handling, though the requirements of signal handling aren't strong enough to prevent the compiler from optimizing away the access to running in your first example).
In fact, here's something the C99 standard says about the abstract machine behavior in pretty much these exact circumstances:
5.1.2.3/8 "Program execution"
EXAMPLE 1:
An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.
Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation defined restrictions.
Finally, you should note that the C99 standard also says:
7.14.1.1/5 "The signal function"
If the signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t ...
So strictly speaking the running variable may need to be declared as:
    volatile sig_atomic_t running = 1;
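Putting that declaration to work, a handler that stays within what the standard allows might look roughly like the sketch below. This is not your original code: run_until_signalled() and its max_iterations parameter are hypothetical, and the signal is delivered with raise() only so that the example terminates on its own rather than waiting for Ctrl-C.

```c
#include <signal.h>
#include <stdio.h>

/* The only kind of static object a handler may portably assign to. */
static volatile sig_atomic_t running = 1;

/* The handler does nothing except clear the flag. */
static void handler(int sig)
{
    (void)sig;
    running = 0;
}

/* Hypothetical helper: installs the handler and loops until the flag is
   cleared or max_iterations is reached. Returns the number of
   iterations performed, or -1 if the handler could not be installed. */
int run_until_signalled(int max_iterations)
{
    int count = 0;

    running = 1;
    if (signal(SIGINT, handler) == SIG_ERR)
        return -1;

    while (running && count < max_iterations) {
        putchar('.');
        ++count;
        if (count == 3)
            raise(SIGINT);   /* synchronous delivery, for demonstration only */
    }
    putchar('\n');
    return count;
}
```

Since running is volatile, every evaluation of the loop condition has to read the object again, so the store made by the handler is seen regardless of what the compiler can or cannot prove about putchar().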