double *d;
int length=10;
memset(d, length, 0);
//or
for (int i=length; i; i--)
d[i]=0.0;
memset(d, 10, 0) is wrong: with the value and count swapped like that it clears nothing, and even with the arguments in the right order a count of 10 would only null 10 bytes, not 10 doubles. Prefer std::fill, as the intent is clearest.
If you really care you should try and measure. However the most portable way is using std::fill():
std::fill( array, array + numberOfElements, 0.0 );
memset(d,0,10*sizeof(*d));
is likely to be faster. As others have said, you can also use
std::fill_n(d, 10, 0.);
but that is most likely just a prettier way of writing the loop.
Don't forget to compare against a properly optimized for loop if you really care about performance.
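For reference, a self-contained sketch of the three variants on a heap-allocated array; d and length follow the question, and the malloc call is only added so the snippet compiles on its own:
#include <algorithm>
#include <cstdlib>
#include <cstring>

int main() {
    const int length = 10;
    double *d = static_cast<double *>(std::malloc(length * sizeof(*d)));
    if (!d) return 1;

    std::fill(d, d + length, 0.0);           // element count, type-safe
    std::fill_n(d, length, 0.0);             // same thing with an explicit count
    std::memset(d, 0, length * sizeof(*d));  // byte count; relies on all-zero bits being 0.0

    std::free(d);
    return 0;
}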
Some variant of Duff's device might help if the array is sufficiently long, and use prefix --i rather than suffix i-- (although most compilers will probably correct that automatically).
Although I'd question if this is the most valuable thing to be optimising. Is this genuinely a bottleneck for the system?
I think you mean
memset(d, 0, length * sizeof(d[0]))
and
for (int i = length; --i >= 0; ) d[i] = 0;
Personally, I do either one, but I suppose std::fill() is probably better.
Note that for memset you have to pass the number of bytes, not the number of elements because this is an old C function:
memset(d, 0, sizeof(double)*length);
memset can be faster since it is usually written in hand-optimized assembly, whereas std::fill is a template function which simply does a loop internally.
But for type safety and more readable code I would recommend std::fill().
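To make the byte-versus-element distinction concrete, a small sketch (the helper names are invented here): std::fill takes an element count, while the memset call is only correct because it multiplies by sizeof(d[0]).
#include <algorithm>
#include <cstring>

// Helper names are made up for this illustration.
void zero_fill(double *d, int length) {
    std::fill(d, d + length, 0.0);             // counts elements; no byte arithmetic to get wrong
}

void zero_memset(double *d, int length) {
    std::memset(d, 0, length * sizeof(d[0]));  // counts bytes; forgetting the sizeof would zero too little
}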
If you're required not to use the STL...
double aValues[10];
ZeroMemory(aValues, sizeof(aValues));
ZeroMemory at least makes the intent clear.
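ZeroMemory is a Win32 macro that fills a block of memory with zeros, so it isn't available outside <windows.h>; if you like the self-documenting call, a tiny wrapper of your own does the same job (zero_memory below is a made-up name, not a standard function):
#include <cstddef>
#include <cstring>

// Hypothetical portable stand-in for the Win32 ZeroMemory macro.
inline void zero_memory(void *dest, std::size_t bytes) {
    std::memset(dest, 0, bytes);
}

int main() {
    double aValues[10];
    zero_memory(aValues, sizeof(aValues));  // intent stays clear at the call site
}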
As an alternative to all the approaches proposed, I can suggest NOT setting the array to all zeros at startup. Instead, set a cell to zero only the first time you access it. This sidesteps your question entirely and may be faster.
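A minimal sketch of that lazy-initialization idea, with the class and member names invented for the example; it trades the up-front zeroing for a branch on every read:
#include <cstdlib>

// Hypothetical lazy-zero array: the doubles are left uninitialized, and a
// one-byte flag per cell records whether it has been written. Reads of an
// untouched cell report 0.0. Clearing one byte per cell is cheaper than
// clearing eight, but whether that actually wins is worth measuring.
struct LazyZeroArray {
    double        *values;
    unsigned char *written;

    explicit LazyZeroArray(int length)
        : values(static_cast<double *>(std::malloc(length * sizeof(double)))),
          written(static_cast<unsigned char *>(std::calloc(length, 1))) {}
    ~LazyZeroArray() { std::free(values); std::free(written); }

    double get(int i) const { return written[i] ? values[i] : 0.0; }
    void   set(int i, double v) { values[i] = v; written[i] = 1; }
};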
In general the memset is going to be much faster; just make sure you get your length right (obviously your example has neither allocated nor defined the array of doubles). If the array truly ends up holding only a handful of doubles, the loop may turn out to be faster. But once the fill loop dwarfs the handful of setup instructions, memset will typically use larger and sometimes aligned chunks to maximize speed.
As usual, test and measure. (although in this case you end up in the cache and the measurement may turn out to be bogus).
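If you do want to measure, a rough harness along these lines (assuming a C++11 compiler for <chrono>) is enough to compare the variants on your own data sizes; the numbers it prints are only meaningful on your hardware and with optimization enabled:
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

int main() {
    const std::size_t n = 1 << 20;  // about a million doubles; adjust to taste
    std::vector<double> d(n, 1.0);

    auto time_it = [&](const char *name, void (*zero)(double *, std::size_t)) {
        auto t0 = std::chrono::steady_clock::now();
        for (int rep = 0; rep < 100; ++rep)
            zero(d.data(), n);
        auto t1 = std::chrono::steady_clock::now();
        long long us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
        std::printf("%s: %lld us for 100 passes\n", name, us);
    };

    time_it("memset   ", [](double *p, std::size_t len) { std::memset(p, 0, len * sizeof(*p)); });
    time_it("std::fill", [](double *p, std::size_t len) { std::fill(p, p + len, 0.0); });
    time_it("for loop ", [](double *p, std::size_t len) { for (std::size_t i = 0; i < len; ++i) p[i] = 0.0; });
    return 0;
}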
Try this, if only to be cool xD
{
    // Duff's device: an unrolled fill. Note it assumes length > 0.
    double *to = d;
    int n = (length + 7) / 8;
    switch (length % 8) {
    case 0: do { *to++ = 0.0;
    case 7:      *to++ = 0.0;
    case 6:      *to++ = 0.0;
    case 5:      *to++ = 0.0;
    case 4:      *to++ = 0.0;
    case 3:      *to++ = 0.0;
    case 2:      *to++ = 0.0;
    case 1:      *to++ = 0.0;
            } while (--n > 0);
    }
}
In addition to the several bugs and omissions in your code, using memset is not portable. You can't assume that a double with all zero bits is equal to 0.0. First make your code correct, then worry about optimizing.
calloc(length, sizeof(double)) will give you a zeroed array in one call. According to IEEE-754, the bit representation of positive zero is all zero bits, and there's nothing wrong with requiring IEEE-754 compliance. (If you need to zero out the array to reuse it, then pick one of the above solutions.)
According to the Wikipedia article on IEEE 754-1985 64-bit floating point, a bit pattern of all 0s will indeed properly initialize a double to 0.0. Unfortunately your memset code doesn't do that.
Here is the code you ought to be using:
memset(d, 0, length * sizeof(double));
As part of a more complete package...
{
double *d;
int length = 10;
d = malloc(sizeof(d[0]) * length);
memset(d, 0, length * sizeof(d[0]));
}
Of course, that's dropping the error checking you should be doing on the return value of malloc. sizeof(d[0]) is slightly better than sizeof(double) because it's robust against changes in the type of d.
Also, if you use calloc(length, sizeof(d[0])) it will clear the memory for you and the subsequent memset will no longer be necessary. I didn't use it in the example because then it seems like your question wouldn't be answered.
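For completeness, here is a sketch of the calloc variant with the error checking mentioned above, written as C++ (hence the cast; in C the cast would simply be dropped):
#include <cstdio>
#include <cstdlib>

int main() {
    const int length = 10;
    // calloc both allocates and zero-fills, so no separate memset is needed.
    double *d = static_cast<double *>(std::calloc(length, sizeof(d[0])));
    if (!d) {
        std::fprintf(stderr, "allocation failed\n");
        return 1;
    }
    /* ... use d ... */
    std::free(d);
    return 0;
}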
The example will not work because you have to allocate memory for your array. You can do this on the stack or on the heap. Here is an example of doing it on the stack:
double d[50] = {0.0};
No memset is needed after that: the remaining 49 elements are zero-initialized automatically.
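And a heap counterpart in C++ (just a sketch; don't forget the matching delete[]):
double *d = new double[50]();
The empty parentheses value-initialize the array, so every element starts as 0.0 and again no memset is needed.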
Assuming the loop length is an integral constant expression, the most probable outcome is that a good optimizer will recognize both the for-loop and the memset(0). The result would be that the assembly generated is essentially equal. Perhaps the choice of registers could differ, or the setup. But the marginal cost per double should really be the same.
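If you want to check that yourself, a minimal pair like the following (function names invented here) can be fed to a compiler with optimization enabled and the generated code compared; no particular output is claimed, it is just something to inspect:
#include <cstring>

void zero_with_loop(double *d) {
    for (int i = 0; i < 10; ++i)  // constant trip count, as assumed above
        d[i] = 0.0;
}

void zero_with_memset(double *d) {
    std::memset(d, 0, 10 * sizeof(*d));
}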