I've heard it said that C++ programmers should avoid memset:

class ArrInit {
    //! int a[1024] = { 0 };
    int a[1024];
public:
    ArrInit() {  memset(a, 0, 1024 * sizeof(int)); }
};

Considering the code above: if you do not use memset, how would you fill a[0..1023] with zeros? What's wrong with memset in C++?

Thanks.

+16  A: 

Zero-initializing should look like this:

class ArrInit {
    int a[1024];
public:
    ArrInit(): a() { }
};

As to using memset, there are a couple of ways to make the usage more robust (as with all such functions): avoid hard-coding the array's size and type:

memset(a, 0, sizeof(a));

For extra compile-time checks it is also possible to make sure that a indeed is an array (so sizeof(a) would make sense):

template <class T, size_t N>
size_t array_bytes(const T (&)[N])  //accepts only real arrays
{
    return sizeof(T) * N;
}

ArrInit() { memset(a, 0, array_bytes(a)); }

But for non-character types, I'd imagine the only value you'd use it to fill with is 0, and zero-initialization should already be available in one way or another.
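As a minimal sketch of the zero-initialization form above, the class below adds an accessor so the result can be checked. The name `ZeroedArr` and the `at()` member are illustrative assumptions, not part of the question's code.

```cpp
#include <cstddef>

// Illustrative class: same layout as ArrInit in the question.
class ZeroedArr {
    int a[1024];
public:
    // a() value-initializes the array, so every element becomes 0.
    ZeroedArr() : a() {}

    // Hypothetical accessor, added only so the contents can be inspected.
    int at(std::size_t i) const { return a[i]; }
};
```

No call to memset and no hard-coded size appears in the constructor, so changing the element type or the array length needs no further edits there.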

UncleBens
Why `unsigned` as the function's return type? Why not `size_t`?
ndim
what if want to initialize the array with non-zero?
Jichao
You can put any value you want inside the braces (e.g. ArrInit(): a() {5}) and it will initialize the array with that value.
Pace
You do realize that all I have to do is change `int` in your example to some class with a virtual function, and your code is likely to wipe out the vptr, don't you? You're explaining how to cause disasters in a slightly safer way.
David Thornley
@Pace: No, you'll get a syntax error. Those braces are the ones delimiting the body of the constructor function. Even with actual array initialization syntax: "int a[1024] = { 5 };" only the elements you list will be initialized, so in this example, only a[0] will be 5, not the entire array.
Dewayne Christensen
A: 

In C++ you should use new. In the case with simple arrays like in your example there is no real problem with using it. However, if you had an array of classes and used memset to initialize it, you wouldn't be constructing the classes properly.

Consider this:

#include <cstring>

class A {
    int i;
public:
    A() : i(5) {}
};

int main() {
    A a[10];
    memset(a, 0, 10 * sizeof(A));
}

The constructor for each of those elements will not be called, so the member variable i will not be set to 5. If you used new instead:

 A *a = new A[10];

then each element in the array will have its constructor called and i will be set to 5.

Casey
I missed the question about initializing it to zero, and was focused on difference between memset and new.
Casey
@Casey: `A a[1]` in my g++ compiler does call the constructor, and the member variable i will be set to 5.
Jichao
`A a[10] = new A[10];` is not valid C++. You seem to be confusing C++ with another language.
Roger Pate
A: 

Your code is fine. I thought the only time in C++ where memset is dangerous is when you do something along the lines of:
YourClass instance; memset(&instance, 0, sizeof(YourClass));

I believe it might zero out internal data in your instance that the compiler created.

ruibm
+19  A: 

The issue is not so much using memset() on the built-in types as using it on class (aka non-POD) types. Doing so will almost always do the wrong thing and frequently do the fatal thing: it may, for example, trample over a virtual function table pointer.
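One way to keep memset away from such types is to let the compiler reject them. The following is only a sketch, assuming C++11 or later for `static_assert` and `<type_traits>`. The helper `zero_bytes` and the two example structs are hypothetical names introduced here.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical helper: zeroes an object's bytes, but only compiles for
// trivially copyable types (classes with virtual functions are rejected).
template <class T>
void zero_bytes(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_bytes: type is not trivially copyable");
    std::memset(&obj, 0, sizeof(T));
}

// Plain struct: fine to clear byte-wise.
struct Plain {
    int id;
    char tag[8];
};

// Has a virtual function, hence (typically) a hidden vtable pointer.
// zero_bytes(shape) would fail to compile for an object of this type.
struct Shape {
    virtual ~Shape() {}
    int sides;
};
```

This check only guards against the class-type problem. It says nothing about whether all-bits-zero is the right value for the members.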

anon
@Neil, Thanks! I was also wondering how could doing memset on built in types cause any problem at all.
Jay
could you add an example where using memset is wrong?
Otto Allmendinger
Using memset on any class with a virtual function is likely to be bad.
David Thornley
@Otto: because sizeof(class) counts the virtual function table pointer as if it were a data member.
Jichao
Or on any class that contains a non-pod type, such as a string
anon
What does POD mean?
toto
`memset` is also problematic when used on some POD types, like pointers and floating point types. Setting all the bytes to 0 will not portably set pointers to NULL or floating point types to 0.0.
Adrian McCarthy
@toto: POD stands for "Plain Old Data". Essentially it refers to built-in types or structs or unions of built-in types. If you can declare it in C, it's probably a POD in C++.
Adrian McCarthy
POD means "plain old data" types without (non-trivial) constructors or destructors.
anon
C++ already has a generic replacement for memset: `std::fill`. So yes, a C++ programmer should avoid memset.
jalf
so is virtual function pointer like an implicit member of a class?
theactiveactor
+1  A: 

It is "bad" because you are not implementing your intent.

Your intent is to set each value in the array to zero and what you have programmed is setting an area of raw memory to zero. Yes, the two things have the same effect but it's clearer to simply write code to zero each element.

Also, it's likely no more efficient.

class ArrInit
{
public:
    ArrInit();
private:
    int a[1024];
};

ArrInit::ArrInit()
{
    for(int i = 0; i < 1024; ++i) {
        a[i] = 0;
    }
}


int main()
{
    ArrInit a;
}

Compiling this with Visual C++ 2008 (32-bit) with optimisations turned on compiles the loop to:

; Line 12
    xor eax, eax
    mov ecx, 1024    ; 00000400H
    mov edi, edx
    rep stosd

Which is pretty much exactly what the memset would likely compile to anyway. But if you use memset there is no scope for the compiler to perform further optimisations, whereas by writing your intent it's possible that the compiler could perform further optimisations, for example noticing that each element is later set to something else before it is used so the initialisation can be optimised out, which it likely couldn't do nearly as easily if you had used memset.

John Burton
I understand, of course, that a default initializer will zero the array too, so this is just an example. But the point stands: implement your requirement, which in this case is to set each array element to zero, rather than some other method that achieves the same result, unless that is the only way to meet other requirements such as performance.
John Burton
+19  A: 

In C++ std::fill or std::fill_n may be a better choice, because it is generic and therefore can operate on objects as well as PODs. However, memset operates on a raw sequence of bytes, and should therefore never be used to initialize non-PODs. Regardless, optimized implementations of std::fill may internally use specialization to call memset if the type is a POD.
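A short sketch of both calls follows. The class `FilledArr` and the function `reset_names` are made-up names for illustration. The first fills an int array with an arbitrary value, the second fills an array of std::string, which memset must never touch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Illustrative class: fills its array with any value, not just zero.
class FilledArr {
    int a[1024];
public:
    explicit FilledArr(int value) { std::fill(a, a + 1024, value); }

    // Hypothetical accessor so the contents can be inspected.
    int at(std::size_t i) const { return a[i]; }
};

// std::fill_n assigns element by element, so it is safe for non-POD types.
inline void reset_names(std::string* names, std::size_t n) {
    std::fill_n(names, n, std::string("unset"));
}
```

For example, `FilledArr f(5);` leaves every element equal to 5, which a byte-wise memset cannot do for int.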

Charles Salvia
I forgot about std::fill so +1 to this from me. Yes, there is a c++ function specifically designed to fill containers so use it!
John Burton
What's the meaning of POD?
Jichao
http://en.wikipedia.org/wiki/Plain_old_data_structures
Pukku
+3  A: 

What's wrong with memset in C++ is mostly the same thing that's wrong with memset in C. memset fills a memory region with the physical all-zero bit pattern, while in virtually 100% of cases what you need is to fill an array with logical zero values of the corresponding type. In C, memset is only guaranteed to properly initialize memory for integer types (and its validity for all integer types, as opposed to just char types, is a relatively recent guarantee added to the C language specification). It is not guaranteed to set floating-point values to zero, and it is not guaranteed to produce proper null pointers.

Of course, the above might be seen as excessively pedantic, since the additional standards and conventions active on the given platform might (and most certainly will) extend the applicability of memset, but I would still suggest following Occam's razor here: don't rely on any other standards and conventions unless you really, really have to. The C++ language (as well as C) offers several language-level features that let you safely initialize your aggregate objects with proper zero values of the proper type. Other answers already mentioned these features.
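One such language-level feature is value-initialization, sketched below. The struct `Record` and the function `make_zeroed_record` are hypothetical names. Each member receives the logical zero of its own type, whatever bit pattern the platform uses for it.

```cpp
// Illustrative aggregate mixing an integer, a floating-point value and a pointer.
struct Record {
    int count;
    double ratio;
    const char* label;
};

// Record() value-initializes the aggregate: count becomes 0, ratio becomes
// 0.0 and label becomes a null pointer, independent of their representation.
inline Record make_zeroed_record() {
    return Record();
}
```

The same effect can be had in a declaration, for instance `Record r = Record();`.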

AndreyT
+1  A: 

In addition to badness when applied to classes, memset is also error prone. It's very easy to get the arguments out-of-order, or to forget the sizeof portion. The code will usually compile with these errors, and quietly do the wrong thing. The symptom of the bug might not manifest until much later, making it difficult to track down.

memset is also problematic with lots of plain types, like pointers and floating point. Some programmers set all bytes to 0, assuming the pointers will then be NULL and floats will be 0.0. That's not a portable assumption.
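The "forgotten sizeof" mistake can be sketched as follows. Both function names are invented for this illustration, and the first one is buggy on purpose.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Buggy on purpose: 'count' is a number of elements, but memset expects a
// number of bytes, so only the first 'count' bytes are cleared.
inline void clear_buggy(int* p, std::size_t count) {
    std::memset(p, 0, count);
}

// Element-wise alternative: std::fill_n takes an element count directly,
// so there is no sizeof to forget.
inline void clear_by_element(int* p, std::size_t count) {
    std::fill_n(p, count, 0);
}
```

The buggy version compiles cleanly. Assuming int is wider than one byte, calling it on an 8-element array leaves the tail of the array untouched.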

Adrian McCarthy
Setting pointers and floating-point numbers to binary zero usually works, but I wouldn't want to get into the habit. Still, the IEEE floating-point standard gets more and more entrenched, and that interprets all-bits-zero as 0.0.
David Thornley
@David: Yup, it usually works, but someday you'll be on a platform where it doesn't.
Adrian McCarthy
A: 

There's no real reason to not use it except for the few cases people pointed out that no one would use anyway, but there's no real benefit to using it either unless you are filling memguards or something.

Charles Eli Cheese