Often the valid way to initialize a data structure is to set all of its members to zero. Even when programming in C++, one may need to interface with an external API for which this is the case.

Is there any practical difference between:

some_struct s;
memset(&s, 0, sizeof(s));

and simply

some_struct s = { 0 };

Do folks find themselves using both, with a method for choosing which is more appropriate for a given application? (Hopefully it is understood that this is only currently applicable to POD structures; you'd get all sorts of havoc if there was a C++ std::string in that structure.)
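For concreteness, here is a minimal sketch of the two forms side by side. The `some_struct` definition and the helper names are hypothetical stand-ins (a real struct would come from the external API's header). The check compares members rather than raw bytes, since padding bytes are not guaranteed to match between the two forms.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical POD struct standing in for one declared by a C API.
struct some_struct {
    int    id;
    double weight;
    char   name[16];
};

// Form 1: declare, then overwrite every byte with zero.
some_struct make_with_memset() {
    some_struct s;
    std::memset(&s, 0, sizeof(s));
    return s;
}

// Form 2: zero-initialize as part of the declaration.
some_struct make_with_initializer() {
    some_struct s = { 0 };
    return s;
}

// True when every member compares equal to its zero value.
bool all_members_zero(const some_struct& s) {
    if (s.id != 0 || s.weight != 0.0) return false;
    for (char c : s.name) {
        if (c != 0) return false;
    }
    return true;
}
```

On common platforms both helpers return a struct for which `all_members_zero` is true; only the second form is guaranteed to do so by the language.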

For myself, as mostly a C++ programmer who doesn't use memset much, I'm never certain of the function signature so I find the second example is just easier to use in addition to being less typing, more compact, and maybe even more obvious since it says "this object is initialized to zero" right in the declaration rather than waiting for the next line of code and seeing, "oh, this object is zero initialized."

When creating classes and structs in C++ I tend to use initialization lists; I'm curious about folks' thoughts on the two "C style" initializations above rather than a comparison against what is available in C++, since I suspect many of us interface with C libraries even if we code mostly in C++ ourselves.

Edit: Neil Butterworth posed this question, in followup, that I believe is an interesting corollary to this question.

+14  A: 

some_struct s = { 0 }; is guaranteed to work; memset relies on implementation details and is best avoided.

Mark Ransom
If this is about C++, then no, it will *not* work when the first member of `some_struct` is an enum object. This is one of the reasons the `= {}` initializers were introduced in C++.
AndreyT
+5  A: 

If the struct contains pointers, the value of all bits zero as produced by memset may not mean the same as assigning a 0 to it in the C (or C++) code, i.e. a NULL pointer.

(It might also be the case with floats and doubles, but that I've never encountered. However, I don't think the standards guarantee them to become zero with memset either.)

Edit: From a more pragmatic perspective, I'd still say to avoid memset whenever possible, as it is an additional function call, longer to write, and (in my opinion) less clear in intent than = { 0 }.

Arkku
The C and C++ standards don't require all bits zero to turn into 0.0 as a floating point number, but the IEEE standards do. There are machines that don't follow the IEEE standards, but all of them of which I'm aware convert all bits zero to 0.0 anyway.
Jerry Coffin
The question asked for *practical* differences. Practically speaking, null pointers are always all-bits-zero, and so are the zero values of the floating-point types.
Rob Kennedy
On the platforms with which I am familiar, all use all-bits-zero to mean NULL, 0, and 0.0, but your point is well taken.
dash-tom-bang
@Rob Kennedy: Well, there are machines with non-zero representations for null pointers so it is a practical difference on those machines. Of course, we may consider such machines impractical. =)
Arkku
@Rob Kennedy: Incorrect. Pointers-to-data-members, like `int S::*p` for example usually *never* use all-zero bit pattern for null pointers. They use all-one bit pattern instead (0xFFF...), which is one *practical* example when you will not obtain a null-pointer with `memset(..., 0, ...)`. The funny part is that this is the case on our everyday machines and implementations: G++, MSVC++. It's been living under our noses all this time. Nothing exotic about it.
AndreyT
+4  A: 

Depending on the compiler optimization, there may be some threshold above which memset is faster, but that would usually be well above the normal size of stack based variables. Using memset on a C++ object with a virtual table is of course bad.
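One way to keep memset away from objects with virtual tables is to let the compiler reject them. The sketch below assumes C++11 and uses a hypothetical helper named `zero_bytes`; the type traits are standard, the rest is invented for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: zero an object's bytes, but refuse to compile
// for types that are not POD-like (trivial and standard-layout).
template <typename T>
void zero_bytes(T& obj) {
    static_assert(std::is_trivial<T>::value &&
                  std::is_standard_layout<T>::value,
                  "zero_bytes requires a POD-like type");
    std::memset(&obj, 0, sizeof(T));
}

struct Plain   { int a; int b; };                  // acceptable target
struct Virtual { virtual ~Virtual() {} int a; };   // carries a vtable pointer

static_assert(std::is_trivial<Plain>::value, "Plain is trivial");
static_assert(!std::is_trivial<Virtual>::value, "Virtual is not trivial");
```

Calling `zero_bytes` on a `Virtual` object fails at compile time instead of silently overwriting the vtable pointer.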

drawnonward
Using `memset` on any non-POD is bad. Also, if `memset` is faster for initializing something, I don't see why the compiler wouldn't then optimize `={}` to such a thing.
GMan
A: 

The bzero function is another option.

#include <strings.h>
void bzero(void *s, size_t n);
maerics
Thomas Matthews
+2  A: 

The only practical difference is that the ={0}; syntax is a bit clearer about saying "initialize this to be empty" (at least it seems clearer to me).

Purely theoretically, there are a few situations in which memset could fail, but as far as I know, they really are just that: theoretical. OTOH, given that it's inferior from both a theoretical and a practical viewpoint, I have a hard time figuring out why anybody would want to use memset for this task.

Jerry Coffin
`memset` fails over `={}` in real environments. (I don't know what the names are, but I know they exist. And they are modern.)
GMan
@GMan: I'm aware of a few environments in which the usual representation of a null pointer does not have all bits zero -- but at least in those of which I'm aware, all-bits-zero *does* create a null pointer as well. Of course, there are machines I haven't used. As I said, even though I can't point at it failing, I can't conceive of a good reason to use `memset` either.
Jerry Coffin
AndreyT
+1  A: 

I've never understood the mysterious goodness of setting everything to zero, which even if it is defined seems unlikely to be desirable. As this is tagged as C++, the correct solution to initialisation is to give the struct or class a constructor.

anon
Seems like a tough row to hoe when you're dealing with 3rd-party libraries written in C, to which you don't have the source code.
dash-tom-bang
It's less arbitrary if your struct contains pointers owned by the struct, since then a cleanup function can call `free` (or `delete`) unconditionally.
jamesdlin
+1  A: 

I think the initialization states much more clearly what you are actually doing: you are initializing the struct. When the new standard is out, that way of initializing will be used even more (initializing containers with {} is something to look forward to). The memset way is slightly more error-prone and does not communicate as clearly what you are doing. That might not count for much while programming alone, but it means a great deal when working in a team.

For some people working with C++, memset, malloc & co. are quite esoteric creatures. I have encountered a few myself.

daramarak
I agree- I can't wait, with the asterisk that I'm worried about what the compiler vendors drop on us when that happens. (Speaking as someone who works on "esoteric" platforms with a small market for compiler vendors.)
dash-tom-bang
+1  A: 

The best method for clearing structures is to set each field individually:

struct MyStruct
{
  std::string name;
  int age;
  double checking_account_balance;
  void clear(void)
  {
     name.erase();
     age = 0;
     checking_account_balance = 0.0;
  }
};

In the above example, a clear method is defined to set all the members to a known state or value. The memset and std::fill methods may not work due to std::string and double types. A more robust program clears each field individually.

I prefer having a more robust program than spending less time typing.

Thomas Matthews
+2  A: 

Hopefully it is understood that this is only currently available for POD structures; you'd get a compiler error if there was a C++ std::string in that structure.

No, you won't. If you use memset on such a struct, at best you will just crash, and at worst you get some gibberish. The = { } way can be used perfectly well on non-POD structs, as long as they are aggregates. The = { } way is the best way to go in C++. Please note that there is no reason in C++ to put that 0 in it, nor is it recommended, since it drastically reduces the cases in which it can be used:

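#include <string>

struct A {
  std::string a;
  int b;
};

int main() {
  A a1 = { 0 };  // wrong: tries to build the std::string from a null pointer
  A a2 = { };    // right: empty string, b == 0
}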

The first will not do what you want: It will try to create a std::string from a C-string given a null pointer to its constructor. The second, however, does what you want: It creates an empty string.
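A checkable version of the second form, as a minimal sketch (the function name `make_empty` is made up for illustration):

```cpp
#include <cassert>
#include <string>

struct A {
    std::string a;
    int b;
};

// Empty braces initialize every member of the aggregate:
// the string is default-constructed and the int becomes 0.
A make_empty() {
    A a = {};
    return a;
}
```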

Johannes Schaub - litb
Ahh fair enough; I shall delete that edit. Letting my memory of "non-aggregates cannot be initialized with initializer list" errors cloud my thinking... Of course, any object which has a single argument (non-explicit?) ctor will have this behavior. In the case of std::string, happily doing string ops from NULL.
dash-tom-bang
+8  A: 

memset is practically never the right way to do it. And yes, there is a practical difference (see below).

In C++ not everything can be initialized with literal 0 (objects of enum types can't be), which is why in C++ the common idiom is

some_struct s = {};

while in C the idiom is

some_struct s = { 0 };
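To illustrate the enum case in C++, here is a small sketch. The `Color` enum and this particular `some_struct` layout are hypothetical, chosen only so that the first member is an enum.

```cpp
#include <cassert>

enum Color { Red, Green };   // hypothetical enum

struct some_struct {
    Color c;   // first member is an enum
    int   n;
};

// some_struct bad = { 0 };  // ill-formed in C++: int does not convert
//                           // implicitly to Color

some_struct make_zeroed() {
    some_struct s = {};      // c becomes Color(0), n becomes 0
    return s;
}
```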

Note that in C, = { 0 } is what can be called the universal zero initializer. It can be used with objects of virtually any type, since {}-enclosed initializers are allowed with scalar objects as well:

int x = { 0 }; /* legal in C (and in C++) */

which makes the = { 0 } useful in generic type-independent C code (type-independent macros for example).

The drawback of the = { 0 } initializer in C89/90 and C++ is that it can only be used as part of a declaration. (C99 fixed this problem by introducing compound literals. Similar functionality is coming to C++ as well.) For this reason you might see many programmers use memset to zero something out in the middle of C89/90 or C++ code. Yet I'd say that the proper way to do it is still without memset, but rather with something like

some_struct s;
...
{
  const some_struct ZERO = { 0 };  
  s = ZERO;
}
...

i.e. by introducing a "fictive" block in the middle of the code, even though it might not look too pretty at first sight. Of course, in C++ there's no need to introduce a block.
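In C++ the same mid-code reset can be sketched without a block, either with a local zeroed constant or by assigning a value-initialized temporary. The struct layout and function names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical struct used only for this illustration.
struct some_struct {
    int    id;
    double ratio;
    void*  handle;
};

// Assign from a zero-initialized constant declared at the point of use.
void reset_with_constant(some_struct& s) {
    const some_struct ZERO = {};
    s = ZERO;
}

// Assign from a value-initialized temporary; no named constant is needed.
void reset_with_temporary(some_struct& s) {
    s = some_struct();
}
```

Both functions leave every member at its zero value, including a null `handle`, regardless of how null pointers are represented in memory.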

As for the practical difference... You might hear some people say that memset will produce the same results in practice, since in practice the physical all-zero bit pattern is what is used to represent zero values for all types. However, this is generally not true. An immediate example that would demonstrate the difference in a typical C++ implementation is a pointer-to-data-member type

struct S;
...

int S::*p = { 0 };
assert(p == NULL); // this assertion is guaranteed to hold

memset(&p, 0, sizeof p);
assert(p == NULL); // this assertion will normally fail

This happens because a typical implementation usually uses the all-one bit pattern (0xFFFF...) to represent the null pointer of this type. The above example demonstrates a real-life practical difference between a zeroing memset and a normal = { 0 } initializer.
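A compilable sketch of the same comparison follows, assuming C++11 and a hypothetical struct `S` with one int member. Only the first function's result is guaranteed by the language; the second depends on how the implementation represents null pointers to data members, so no particular result is claimed for it here.

```cpp
#include <cassert>
#include <cstring>

struct S { int x; };   // hypothetical struct for illustration

// Guaranteed: initializing from 0 yields the null member pointer.
bool initializer_gives_null() {
    int S::*p = 0;
    return p == nullptr;
}

// Implementation-dependent: whether an all-zero byte pattern
// compares equal to the null member pointer.
bool memset_gives_null() {
    int S::*p = &S::x;
    std::memset(&p, 0, sizeof p);
    return p == nullptr;
}
```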

AndreyT
@AndreyT: `"Of course, in C++ there's no need to introduce a block"`, Why? Also, why do you want to introduce the block in C?
Lazer
@eSKay: Because C++ allows one to add declarations in the middle of the code. There's no need for a block. C99 allows that as well, but as I said above in C99 you have a better option: compound literals. The only case that's left uncovered is C89/90. In C89/90 you can't just declare a variable in the middle of the code. You need a block. This is why I want to introduce a block in C (implying C89/90).
AndreyT
@AndreyT: thanks! never knew that "in C89/90 you can't just declare a variable in the middle of the code". But I tested [this code](http://codepad.org/nx3vzW1o) using `gcc -std=c89 check89.c`, and it compiles and runs fine!
Lazer
@eSKay: GCC is well-known for quietly taking quite a few liberties with the language. If you want GCC to at least *resemble* standard C, you need to run it with `-ansi -pedantic-errors` settings. Just `-std=c89` is not enough.
AndreyT
@AndreyT: thanks! `error: ISO C90 forbids mixed declarations and code`
Lazer
@AndreyT, interesting how different people interpret answers differently :) I thought the reason you introduced the block is to reduce visibility of `ZERO`, and that the reason you said it's not needed in C++ is because you can say `s = some_struct();` :) To get rid of other interpretational issues, the feature you say is coming to C++ is `s = { }`, right? It's the new unified initializers thing of C++0x.
Johannes Schaub - litb
@litb: I have to admit, the `s = some_struct()` variant somehow slipped my mind. Indeed, in C++ one doesn't really need that `ZERO` trick. Reducing visibility of ZERO is certainly important to keep in mind as well.
AndreyT
A: 

In C I prefer using {0,} to the equivalent memset(). However gcc warns about this usage :( Details here: http://www.pixelbeat.org/programming/gcc/auto_init.html

In C++ they're usually equivalent, but as always with C++ there are corner cases to consider (noted in other answers).

pixelbeat