There are two ways to zero out an integer/float array:

memset(array, 0, sizeof(int)*arraysize);

or:

for (int i=0; i <arraysize; ++i)
    array[i]=0;

Obviously, memset is faster for a large arraysize. But at what point does the overhead of memset exceed the overhead of the for loop? For example, for an array of size 5, which would be best: the first, the second, or maybe even the unrolled version:

array[0] = 0;
array[1] = 0;
array[2] = 0;
array[3] = 0;
array[4] = 0;
+21  A: 

In all likelihood, memset() will be inlined by your compiler (most compilers treat it as an 'intrinsic', which basically means it's inlined, except maybe at the lowest optimization levels or when explicitly disabled).

For example, here are some release notes from GCC 4.3:

Code generation of block move (memcpy) and block set (memset) was rewritten. GCC can now pick the best algorithm (loop, unrolled loop, instruction with rep prefix or a library call) based on the size of the block being copied and the CPU being optimized for. A new option -minline-stringops-dynamically has been added. With this option string operations of unknown size are expanded such that small blocks are copied by in-line code, while for large blocks a library call is used. This results in faster code than -minline-all-stringops when the library implementation is capable of using cache hierarchy hints. The heuristic choosing the particular algorithm can be overwritten via -mstringop-strategy. Newly also memset of values different from 0 is inlined.

It might be possible for the compiler to do something similar with the alternative examples you gave, but I'd bet it's less likely to.

As a bonus, memset() is grep-able and makes the intent obvious at a glance (not that the loop is particularly difficult to grok either).

Michael Burr
great answer, thanks!
Claudiu
This is a great example of the often-heard "the compiler is better than you at optimizing". Few application programmers would spend this amount of attention on a single call (and if they did, their actual application would likely suffer). :)
unwind
unwind, if you think 'the compiler is better than you' is something that you should follow, check this out - http://www.liranuna.com/sse-intrinsics-optimizations-in-popular-compilers/
LiraNuna
@LiraNuna - that's quite an interesting article, but I think you'd agree that memset()/memcpy() are in a different class than SSE intrinsics in terms of how much work has gone into compiler code generation. Also, you'd only want to do the level of analysis that you did on code that's truly performance critical (or maybe as an academic exercise), and therefore worthy of your in-depth attention - not for each and every buffer copy or clear.
Michael Burr
+5  A: 

As Michael already noted, gcc (and, I'd guess, most other compilers) already optimizes this very well. For example, gcc turns this

char arr[5];
memset(arr, 0, sizeof arr);

into

movl  $0x0, <arr+0x0>
movb  $0x0, <arr+0x4>

It doesn't get any better than that...

bdijkstra
+3  A: 

There's no way of answering the question without measuring. It will depend entirely on the compiler, CPU and runtime library implementations.
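To give "measuring" a concrete shape, here is a rough sketch of a timing helper. The names zero_loop, zero_memset and time_zeroing are made up for illustration. Note one deliberate trade-off: the calls go through a volatile function pointer so the optimizer can't throw the work away, which also prevents inlining, so this only compares the out-of-line versions. For numbers that matter, time the code in its real context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Zero an int array with an explicit loop. */
void zero_loop(int *a, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = 0;
}

/* Zero the same array with memset. */
void zero_memset(int *a, size_t n)
{
    memset(a, 0, n * sizeof *a);
}

/* Hypothetical helper: CPU seconds spent calling fn(a, n) `reps` times. */
double time_zeroing(void (*fn)(int *, size_t), int *a, size_t n, long reps)
{
    assert(a != NULL && n > 0 && reps > 0);

    /* Volatile pointer: forces a real call on every iteration. */
    void (*volatile call)(int *, size_t) = fn;

    clock_t start = clock();
    for (long r = 0; r < reps; ++r) {
        a[0] = 1;      /* dirty the array so each call has work to do */
        call(a, n);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_zeroing(zero_loop, arr, 5, 10000000L) and time_zeroing(zero_memset, arr, 5, 10000000L) with the same array and repetition count gives two figures to compare. Expect them to change with compiler, flags and CPU.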

memset() can be a bit of a "code smell", because it can be prone to buffer overflows and parameter reversals, and has the unfortunate limitation of only setting memory 'byte-wise'. However, it's a safe bet that it will be 'fastest' in all but extreme cases.
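The 'byte-wise' point is easy to trip over, so here is a small sketch of it. The function names are made up for illustration. memset() writes its value into every byte, so zero works as expected for ints, but any other value does not produce an int with that value.

```c
#include <assert.h>
#include <string.h>

/* Returns the int you get after setting every one of its bytes to `byte`. */
int fill_int_bytewise(int byte)
{
    int x;
    memset(&x, byte, sizeof x);
    return x;
}

/* Assertions as documentation of what actually happens. */
void bytewise_demo(void)
{
    /* All bytes zero is the int 0, which is why zeroing works. */
    assert(fill_int_bytewise(0) == 0);

    /* Every byte becomes 0x01, so the result is not the int 1
       (it is 0x01010101 when int is 4 bytes wide). */
    assert(fill_int_bytewise(1) != 1);
}
```

So memset(array, 1, sizeof array) on an int array does not fill it with ones. For anything other than zero (or an all-bits-set pattern), the loop is the way to go.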

I tend to use a macro to wrap this to avoid some of the issues:

#define CLEAR(s) memset(&(s), 0, sizeof(s))

This sidesteps the size calculations and removes the problem of swapping the length and value parameters.
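A minimal usage sketch of the macro on a struct and on a true array. The struct point type and the clear_demo function are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <string.h>

#define CLEAR(s) memset(&(s), 0, sizeof(s))

struct point { int x, y; };   /* hypothetical type for the example */

/* Clears a struct and an array, then returns the sum of all their fields. */
int clear_demo(void)
{
    struct point p = { 3, 4 };
    int arr[5] = { 1, 2, 3, 4, 5 };

    CLEAR(p);     /* sizeof(p) covers the whole struct */
    CLEAR(arr);   /* arr is a real array here, so sizeof(arr) is 5 * sizeof(int) */

    int sum = p.x + p.y;
    for (int i = 0; i < 5; ++i)
        sum += arr[i];

    assert(sum == 0);
    return sum;
}
```

The macro only does the right thing when it is handed the object itself. An array that has decayed to a pointer (a function parameter, say) would have only the pointer's size cleared.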

In short, use memset() "under the hood". Write what you intend, and let the compiler worry about optimizations. Most are incredibly good at it.

Roddy
++ Ohmygosh! U use a macro!? Better go into hiding!
Mike Dunlavey
I think you made the macro parameter (x), but used (s) in the actual body... might want to edit that.
micmoo
@micmoo - thanks - fixed. @mike re Macros: yes... however they're unavoidable in C. The C++ answer to this question would be *very* different!
Roddy
@Roddy: BTW that was sarcasm :) Regarding the current macros-are-bad religion, I'm a devout skeptic.
Mike Dunlavey
Your macro would make it hard to spot the problem with code such as this: `char* foo = malloc(80); CLEAR(foo);`
Dan Breslau
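To spell out what goes wrong in that snippet, here is a sketch. The function name is made up, and it assumes a platform where an all-zero-bits pointer compares equal to NULL (true on mainstream systems, though not guaranteed by the C standard). CLEAR(foo) clears the pointer variable, not the 80 bytes it points to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CLEAR(s) memset(&(s), 0, sizeof(s))

/* Returns 1 if CLEAR() nulled the pointer and left the buffer untouched,
   0 otherwise, or -1 if the allocation failed. */
int clear_pointer_pitfall(void)
{
    char *foo = malloc(80);
    if (foo == NULL)
        return -1;

    char *keep = foo;          /* second handle so the buffer can be checked and freed */
    memset(keep, 'x', 80);

    CLEAR(foo);                /* zeroes sizeof(char *) bytes of foo itself */

    int buffer_untouched = (keep[0] == 'x' && keep[79] == 'x');
    int pointer_nulled = (foo == NULL);   /* assumes NULL is all zero bits */

    free(keep);
    return buffer_untouched && pointer_nulled;
}
```

Without the extra keep pointer, the allocation would simply be leaked and the buffer would still hold its old contents.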