Coming from a C background, I've always assumed that POD types (e.g. ints) were never automatically zero-initialized in C++, but it seems this was plain wrong!

My understanding is that only 'naked' non-static POD values don't get zero-filled, as shown in the code snippet. Have I got it right, and are there any other important cases that I've missed?

#include <vector>

static int a;

struct Foo { int a; };

void test()
{
  int b;
  Foo f;
  int *c = new(int);
  std::vector<int> d(1);

  // At this point...
  // a is zero
  // f.a is zero
  // *c is zero
  // d[0] is zero
  // ... BUT ... b is undefined
}
+1  A: 

Actually some of the values being zero may be due to you trying this code in the debug version of the application (if that is the case).

If I'm not mistaken, in your code:

  • a should be uninitialized.
  • b should be uninitialized
  • c should point to a new (uninitialized) int
  • d should be initialized to [0] (as you correctly guessed)
utnapistim
I think `a` would be cleared to `0` in most Unix-like operating systems today. Technically, `a` would be in the `.bss` segment, which usually gets set to all 0's before `main()` gets called.
Mike DeSimone
Yes, but my point was that he shouldn't rely on that even if the value appears as zero in the code. Explicit initialization is the way to go here.
utnapistim
Unless `a` has been modified before calling `test()`, it will have a value of zero. Objects with static storage duration are zero-initialized when the program starts.
James McNellis
+6  A: 

Assuming you haven't modified a before calling test(), a has a value of zero, because objects with static storage duration are zero-initialized when the program starts.
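
As a minimal sketch of the static-storage rule, the following checks a few objects that have no explicit initializer. The names counter, global_foo, table, sum_of_statics and check_statics are made up for illustration and do not come from the question.

```cpp
#include <cassert>

struct Foo { int a; };

static int counter;      // namespace-scope static: zero-initialized
static Foo global_foo;   // members of a static POD struct are zeroed too
static int table[4];     // as is every element of a static array

// Sums every static object above plus a function-local static.
// Nothing writes to them, so the result is 0.
int sum_of_statics() {
    static int local_static;  // also has static storage duration
    int sum = counter + global_foo.a + local_static;
    for (int i = 0; i < 4; ++i) {
        sum += table[i];
    }
    return sum;
}

// Hypothetical self-check; call it from main() to verify the claim.
void check_statics() {
    assert(sum_of_statics() == 0);
}
```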

d[0] has a value of zero, because the constructor invoked by std::vector<int> d(1) has a second parameter that takes a default argument; that second argument is copied into all of the elements of the vector being constructed. The default argument is T(), so your code is equivalent to:

std::vector<int> d(1, int());
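
A small sketch comparing the two spellings; the function name sized_vector_is_zeroed is hypothetical. Note that since C++11 the one-argument constructor value-initializes the elements directly instead of copying a default argument, but for int the observable result is the same.

```cpp
#include <cassert>
#include <vector>

// Returns true when both spellings produce the same single zero element.
bool sized_vector_is_zeroed() {
    std::vector<int> d(1);         // one element, initialized to zero
    std::vector<int> e(1, int());  // explicit form: int() is 0
    assert(d.size() == 1);
    return d[0] == 0 && d == e;
}
```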

You are correct that b has an indeterminate value.

f.a and *c both have indeterminate values as well. To value-initialize them (which for POD types is the same as zero-initialization), you can use:

Foo f = Foo();      // You could also use Foo f((Foo()))
int* c = new int(); // Note the parentheses
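
Put together, a sketch of these forms might look like the following. The function name sum_of_value_initialized and the extra pointer g are illustrative only.

```cpp
#include <cassert>

struct Foo { int a; };

// Adds up several value-initialized PODs; each one contributes zero.
int sum_of_value_initialized() {
    int b = int();       // value-initialized scalar
    Foo f = Foo();       // value-initialized struct, so f.a is 0
    int* c = new int();  // the parentheses request value-initialization
    Foo* g = new Foo();  // same for a heap-allocated struct
    int sum = b + f.a + *c + g->a;
    delete c;
    delete g;
    assert(sum == 0);    // documents the expectation
    return sum;
}
```

Dropping the initializer or the parentheses (int b;, Foo f;, new int, new Foo) gives default-initialization instead, which leaves these POD objects with indeterminate values.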
James McNellis
Mostly agreed; only I'm not sure why `Foo f` doesn't call the synthesized constructor, and if it does, why that would be any different from `Foo f=Foo();`...
xtofl
Thanks. And double thanks for the awesome link in your comment!
Roddy
@xtofl: If no initializer is present (as is the case with `Foo f;`) for a POD type object, then that object is left uninitialized. `Foo f = Foo();` creates a value-initialized `Foo` (that's what the `Foo()` part does) and then uses that to initialize `f`.
James McNellis
A: 

As I understand it, POD types are initialized depending on the part of memory in which they are placed. Your static int a is allocated in the data segment, so it has a default value on startup. However, I think f is not initialized in your example...

kbok
A: 

They don't. Debug builds might do this, but typically the variable is just placed in memory and holds whatever value happened to be there.

Alan
A: 

Note that the zero-initialization done by the OS as a security feature is usually only done the first time memory is allocated. By that I mean any segment in the heap, stack, and data sections. The stack and data sections are typically of fixed size, and are initialized when the application is loaded into memory.

The data segment (containing static/global data and code) typically doesn't get "re-used", although that may not be the case if you dynamically load code at runtime.

The memory in the stack segment gets re-used all the time. Local variables, function stack frames, etc. are constantly used and re-used, and are not initialized every time, only when the application is first loaded.

However, when the application makes requests for heap memory, the memory manager will typically zero-initialize segments of memory before granting the request, but only for new segments. If you make a request for heap memory, and there is free space in a segment that was already initialized, the initialization isn't done a second time. Therefore, there is no guarantee that if that particular segment of memory is re-used by your application, it will get zero-initialized again.

So, for example, if you allocate a Foo on the heap, assign its field a value, delete the Foo instance, and then create a new Foo on the heap, there is a chance that the new Foo will be allocated in the same exact memory location as the old Foo, and so its field will initially have the same value as the old Foo's field.
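
Reading the stale field directly would mean reading an indeterminate value, so the sketch below shows the safe counterpart instead: it simulates recycled memory by filling a raw block with non-zero bytes, then constructs a Foo in it with value-initialization. The function name value_init_over_dirty_memory is hypothetical, and Foo is assumed to be the struct from the question.

```cpp
#include <cassert>
#include <cstring>
#include <new>

struct Foo { int a; };

// Constructs a value-initialized Foo on top of "dirty" memory and
// returns its field, which is zero regardless of the old contents.
int value_init_over_dirty_memory() {
    void* raw = ::operator new(sizeof(Foo));
    std::memset(raw, 0xFF, sizeof(Foo));  // stand-in for stale heap data
    Foo* f = new (raw) Foo();             // value-initialization zeroes f->a
    int result = f->a;
    f->~Foo();
    ::operator delete(raw);
    assert(result == 0);
    return result;
}
```

With new (raw) Foo (no parentheses) the field would be left with an indeterminate value, which is exactly the situation described above.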

If you think about it, this makes sense, because the OS is only initializing the data to prevent one application from accessing the data from another application. There is less risk in allowing an application access to its own data, so for performance reasons the initialization isn't done every time - just the first time a particular segment of memory is made available for use by the application (in any segment).

When you run an application in debug mode, however, some debug runtimes initialize stack and heap data at every allocation (so your Foo field will always be initialized). Different debug runtimes initialize the data to different values: some zero-initialize, and some write a "marker" value.

The point is: never ever use uninitialized values anywhere in your code. There is absolutely no guarantee that they will be zero-initialized. Also, be sure to read the previously linked article regarding parens and default vs value initialization, as this affects the definition of an "uninitialized" value.

Jeremy Bell