Everywhere I look there are people who argue vociferously that uninitialised variables are bad, and I certainly agree and understand why. However, my question is: are there occasions when you would not want to initialise them?

For example, take the code:

char arrBuffer[1024] = { '\0' };

Does NULLing the entire array create a performance impact over using the array without initialising it?

A: 

And why do you care about the performance benefit? How much performance will you gain by not initializing it, and is it worth more than the time saved during debugging by not having to chase garbage pointers?

Priyank Bolia
Not the question I am asking. I agree, safety is paramount, however I want to know whether it will have a performance impact.
Konrad
I guess everyone knows there will be a negligible performance impact due to memory copy.
Priyank Bolia
+4  A: 

The rule is that variables should be set before they're used.

You do not have to explicitly initialize them on creation if you know you will be setting them elsewhere before use.

For example, the following code is perfectly okay:

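#include <stddef.h>

int main (void) {
    int a[1000];    // Deliberately not initialized at creation.
    // ... other code that does not read a[] ...
    for (size_t i = 0; i < sizeof(a)/sizeof(*a); i++)
        a[i] = (int)i;    // Every element is set here, before any use.
    // Now use a[whatever] here.
    return 0;
}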

In that case, it's wasteful to initialize the array at the point of its creation.

As to whether there's a performance penalty, it depends partially on where your variable is defined and partially on the execution environment.

The C standard guarantees that variables defined with static storage duration (either at file level or as statics in a function) are first initialized to a bit pattern of all zeros, then set to their respective initialized values.

It does not mandate how that second step is done. A typical way is to just have the compiler itself create the initialized variable and place it in the executable so that it's initialized by virtue of the fact that the executable is loaded. This will have no performance impact for initialization, although it will obviously have some impact on program load.

Of course, an implementation may wish to save space in the executable and initialize those variables with code (before main is called). This will have a performance impact but it's likely to be minuscule.

As for variables with automatic storage duration (local variables and such), they're never implicitly initialized; they only hold a defined value once you give them one, so explicitly initializing them does carry a performance penalty. By "never implicitly initialized", I mean the code segment:

void x(void) {
    int x[1000];
    ...
}

will result in x[] having indeterminate values. But since:

void x(void) {
    int x[1000] = {0};
}

may simply result in a 1000-integer memcpy-type operation (more likely memset for that case), this is likely to be fast as well. You just need to keep in mind that the initialization will happen every time that function is called.

paxdiablo
Hmm, I've never read in the draft Standard about this two-phase initialization. All that 6.7.8/10 says "If an object that has static storage duration is not initialized explicitly, then ... - if it has pointer type, it is initialized to a null pointer ...". Can you please gimme some hints where i can find it saying they are all-bit-zero initialized first? Thanks mate.
Johannes Schaub - litb
Even if it did say that, I reckon the implementation could "as-if" its way out of it, if it wanted to.
Steve Jessop
@JS, s5.1.2 Execution Environments (c1x,n1362): Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment. Before program startup, the storage area that holds all objects with static storage duration shall first be cleared (all bytes set to zero), then the objects shall be initialized (set to their initial values). The manner and timing of such initialization are otherwise unspecified. Program termination returns control to the execution environment.
paxdiablo
A: 

For a large array the performance impact may be significant. Initializing all variables by default doesn't actually offer many benefits. It's not a solution for bad code; moreover, it might hide actual issues which could otherwise be caught by the compiler. You need to keep track of the state of all variables over their whole lifespan to make your code reliable anyway.

+2  A: 

Measure!

#include <stdio.h>
#include <time.h>

int main(void) {
  clock_t t0;
  int k;

  t0 = clock();
  for (k=0; k<1000000; k++) {
    int a[1000];
    a[420] = 420;
  }
  printf("Without init: %f secs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

  t0 = clock();
  for (k=0; k<1000000; k++) {
    int a[1000] = {0};
    a[420] = 420;
  }
  printf("   With init: %f secs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

  return 0;
}
$ gcc measure.c
$ ./a.out
Without init: 0.000000 secs
   With init: 0.280000 secs
$ gcc -O2 measure.c
$ ./a.out
Without init: 0.000000 secs
   With init: 0.000000 secs
pmg
You're measuring variable initialization on the stack. This is different than a global variable initializer. Though, it's not clear which the poster is using.
spoulson
Also initialising to '0' not '\0' there. Not sure if that would make a difference..
Konrad
Point is, if the poster is worried about a few nanoseconds, he should measure before determining where to improve the code. I'm 99.9999% sure it won't be the initialization that matters.
pmg
Whyamistilltyping, he's initializing with `0`, not `'0'`. The former is equivalent to `'\0'`, the latter isn't.
avakar
... also note that you can leave the zero out entirely: `int a[1000] = {};`.
avakar
What have you actually measured in the -O2 case? Probably that the optimizer can entirely remove code that has no effect - so what? How is that helpful or relevant?
Dipstick
@avakar: that's an extension your compiler provides when it is invoked in a non-conforming mode: `int a[1000] = {};` requires a diagnostic (and may halt compilation) from a standard compliant compiler.
pmg
@chrisharris: the question is `Does NULLing the entire array create a performance impact over using the array without initialising it?`. And the answer is `It depends. Measure your code under your conditions.`
pmg
@avakar: I'm talking about C. I noticed the `C++` tag now; I don't know if an empty initializer list is allowed in `C++`.
pmg
@pmg, well it's the other way around: C++ allows it, C forbids it.
Johannes Schaub - litb
That's what I meant, @Johannes. I know C forbids it; I wasn't sure about C++
pmg
Oh i see now. Nvm xD
Johannes Schaub - litb
pmg, my bad, I didn't notice the C tag. The question has C++ in the title and I didn't look at the tags at all.
avakar
+10  A: 

I assume a stack initialization because static arrays are auto-initialized.
G++ output

   char whatever[2567] = {'\0'};
   8048530:       8d 95 f5 f5 ff ff       lea    -0xa0b(%ebp),%edx
   8048536:       b8 07 0a 00 00          mov    $0xa07,%eax
   804853b:       89 44 24 08             mov    %eax,0x8(%esp)
   804853f:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
   8048546:       00 
   8048547:       89 14 24                mov    %edx,(%esp)
   804854a:       e8 b9 fe ff ff          call   8048408 <memset@plt>

So, you initialize with {'\0'} and a call to memset is done, so yes, you have a performance hit.

Arkaitz Jimenez
(+1), but what if the array is non-static?
Konrad
The example is precisely for non-static, stack-based arrays. If by non-static you mean an array whose size is unknown at compile time (as in C99), the code would be more or less the same, since memset is still called to zero it.
Arkaitz Jimenez
+7  A: 

If the variable is a global or static, then its data is typically stored verbatim in the compiled executable. So, your char arrBuffer[1024] will increase executable size by 1024 bytes. Initializing it will ensure the executable contains your data instead of the default 0's or whatever the compiler chooses. When the program starts, no processing is required to initialize the variables.

On the other hand, variables on the stack, such as non-static local function variables, are not stored in the executable the same way. Instead, on function entry the space is allocated on the stack and a memcpy places the data into the variable, thereby impacting performance.

spoulson
A: 

To answer your question: it might have a performance impact. It's possible that a compiler could detect that the initial values of the array are never used and simply skip the initialization.

I personally think this is a matter of personal style. I'm tempted to say: leave it uninitialised, and use a Lint-like tool to tell you if you're using it uninitialised, which is surely a bug (as opposed to using the default value and not being told, which is also a bug, but a silent one).

Kaz Dragon
A: 

I consider it bad advice to require all variables to be default-initialized at the time of declaration. In most cases it is unnecessary and carries a performance penalty.

For example, I often use the code below to convert a number to a string:

char s[24];
sprintf(s, "%d", int_val);

I won't write:

char s[24] = "\0";
sprintf(s, "%d", int_val);

Modern compilers can usually warn when a variable is used without being initialized.

Sherwood Hu
A: 

Your variables should be initialized to a meaningful value. Blindly and naively setting everything to zero isn't much better than leaving it uninitialized. It might make invalid code crash, instead of behaving unpredictably, but it won't make the code correct.

If you just naively zero out the array when creating it just to avoid uninitialized variables, it is still logically uninitialized. It doesn't yet have a value that is meaningful in your application.

If you're going to initialize variables (and you should), give them values that make sense in your application. Does the rest of your code expect the array to be zero initially? If so, set it to zero. Otherwise set it to some other meaningful value.

Or if the rest of your code expects to write to the array without first reading from it, then by all means leave it uninitialized.

jalf
A: 

Personally I'm against initializing an array at creation. Consider the following two pieces of code.

char buffer[1024] = {0};
for (int i = 0; i < 1000000; ++i)
{
  // Use buffer
}

vs.

for (int i = 0; i < 1000000; ++i)
{
  char buffer[1024] = {0};
  // Use buffer
}

In the first example why bother initializing buffer since the second time around the loop buffer is no longer 0 initialized? My use of buffer must work without it being initialized for all but the first iteration. All the initialization does is consume time, bloat the code and obscure bugs if typically I only go through the loop once.

While I could certainly re-factor the code as the second example, do I really want to zero initialize a buffer inside a loop if I could re-write my code so it was not necessary?

I suspect most compilers these days have options to fill uninitialized variables with non-zero values. We run all our debug builds this way to help detect use of uninitialized variables, and in release mode we turn the option off so the variables are truly uninitialized. As Sherwood Hu said, some compilers can inject code to help detect use of uninitialised variables.

Edit: In the code above I'm initializing buffer to the value 0, (not the character '0'), which is equivalent to initializing it with '\0'.

To further clarify my first code snippet, imagine the following contrived example.

char buffer[1024] = {0};
for (int i = 0; i < 1000000; ++i)
{
  // Buffer is 0 initialized, so it is fine to call strlen
  size_t len = strlen (buffer);
  memset (buffer, 'a', 1024);
}

The first time through the loop the buffer is initialized to 0, so strlen will return 0. The second time through the loop the buffer is no longer initialized to 0, and in fact does not contain a single 0 character, so the behaviour of strlen is undefined.

Since you have agreed with me that, if buffer is initialized, moving it inside the loop is not advisable, and I've shown that initializing it outside the loop offers no protection, why initialize it at all?

Stephen Nutt
Firstly, I am initialising to null '\0', not '0', which is useful for a variety of reasons. Secondly, initialising does not incur the same performance hit as assignment. Your second code snippet, I agree, is inadvisable. I also don't understand your reasoning about the 'second time around'. I don't really think you have explained your point; could you elaborate?
Konrad
I've edited my post to clarify one of the examples.
Stephen Nutt