ansaurus

Question

clearing a small integer array: memset vs. for loop

Answer 1

+21 A:

In all likelihood, memset() will be inlined by your compiler (most compilers treat it as an 'intrinsic', which basically means it's inlined, except maybe at the lowest optimization levels or unless explicitly disabled).

For example, here are some release notes from GCC 4.3:

Code generation of block move (memcpy) and block set (memset) was rewritten. GCC can now pick the best algorithm (loop, unrolled loop, instruction with rep prefix or a library call) based on the size of the block being copied and the CPU being optimized for. A new option -minline-stringops-dynamically has been added. With this option string operations of unknown size are expanded such that small blocks are copied by in-line code, while for large blocks a library call is used. This results in faster code than -minline-all-stringops when the library implementation is capable of using cache hierarchy hints. The heuristic choosing the particular algorithm can be overwritten via -mstringop-strategy. Newly also memset of values different from 0 is inlined.

It might be possible for the compiler to do something similar with the alternative examples you gave, but I'd bet it's less likely to.

memset() is also grep-able, and its intent is more immediately obvious at a glance (not that the loop is particularly difficult to grok either).
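
As a minimal sketch of the two approaches being compared (the function names and the array length used in the usage note are illustrative assumptions, not taken from the question), both of the following leave the array all zero. Which one compiles to better code depends on the compiler, target and flags, so check the generated assembly if it really matters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clear n ints with a single memset() call.
   All-bits-zero is a valid representation of 0 for integer types. */
void clear_with_memset(int *arr, size_t n)
{
    assert(arr != NULL);
    memset(arr, 0, n * sizeof arr[0]);
}

/* Hypothetical helper: clear n ints with an explicit loop. */
void clear_with_loop(int *arr, size_t n)
{
    assert(arr != NULL);
    for (size_t i = 0; i < n; ++i) {
        arr[i] = 0;
    }
}
```

For a local array such as `int a[8];`, a call like `clear_with_memset(a, 8)` is equivalent to writing `memset(a, 0, sizeof a)` at the point of use, which is the form a compiler is most likely to expand inline.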

Michael Burr 2009-07-15 21:16:16

great answer, thanks!

Claudiu 2009-07-15 21:39:26

This is a great example of the often-heard "the compiler is better than you at optimizing". Few application programmers would spend this amount of attention on a single call (and if they did, their actual application would likely suffer). :)

unwind 2009-07-16 08:59:30

unwind, if you think 'the compiler is better than you' is something that you should follow, check this out - http://www.liranuna.com/sse-intrinsics-optimizations-in-popular-compilers/

LiraNuna 2009-07-30 21:55:02

@LiraNuna - that's quite an interesting article, but I think you'd agree that memset()/memcpy() are in a different class than SSE intrinsics in terms of how much work has gone into compiler code generation. Also, you'd only want to do the level of analysis that you did on code that's truly performance critical (or maybe as an academic exercise), and therefore worthy of your in-depth attention - not for each and every buffer copy or clear.

Michael Burr 2009-07-30 22:44:14

Answer 2

+5 A:

As Michael already noted, gcc (and, I'd guess, most other compilers) already optimizes this very well. For example, gcc turns this

char arr[5];
memset(arr, 0, sizeof arr);

into

movl  $0x0, <arr+0x0>
movb  $0x0, <arr+0x4>

It doesn't get any better than that...

bdijkstra 2009-07-15 21:54:05

Answer 3

+3 A:

There's no way of answering the question without measuring. It will depend entirely on the compiler, cpu and runtime library implementations.

memset() can be a bit of a "code smell", because it is prone to buffer overflows and parameter reversals, and has the unfortunate limitation of only setting memory 'byte-wise'. However, it's a safe bet that it will be 'fastest' in all but extreme cases.
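
To illustrate the 'byte-wise' point, here is a small sketch (the helper name is hypothetical). memset() copies its value argument into every byte, so filling an int array with 1 does not store the integer 1 in each element. Zero is the one value where the distinction doesn't matter for integers:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill n unsigned ints with byte_value via memset()
   and return the first element so the caller can inspect the result. */
unsigned int first_after_memset(unsigned int *arr, size_t n, int byte_value)
{
    assert(arr != NULL && n > 0);
    memset(arr, byte_value, n * sizeof arr[0]);
    return arr[0];
}
```

Assuming a 4-byte unsigned int, `first_after_memset(a, 4, 1)` yields 0x01010101 rather than 1, while a byte value of 0 yields 0 as expected.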

I tend to use a macro to wrap this to avoid some of the issues:

#define CLEAR(s) memset(&(s), 0, sizeof(s))

This sidesteps the size calculations and removes the problem of swapping the length and value parameters.

In short, use memset() "under the hood". Write what you intend, and let the compiler worry about optimizations. Most are incredibly good at it.
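
A rough usage sketch of the CLEAR macro on a real array and on a struct (the `struct point` type and the demo function names are made up for illustration). In both cases sizeof(s) covers the whole object, so no length has to be written by hand:

```c
#include <assert.h>
#include <string.h>

#define CLEAR(s) memset(&(s), 0, sizeof(s))

/* Hypothetical type, used only for this illustration. */
struct point {
    int x;
    int y;
};

/* Clears a local array; returns 0 if every element ended up zero. */
int clear_array_demo(void)
{
    int counts[4] = {1, 2, 3, 4};
    assert(sizeof(counts) == 4 * sizeof(int)); /* whole array, not a pointer */
    CLEAR(counts);
    return counts[0] | counts[1] | counts[2] | counts[3];
}

/* Clears a local struct; returns 0 if both members ended up zero. */
int clear_struct_demo(void)
{
    struct point p = {3, 4};
    CLEAR(p);
    return p.x | p.y;
}
```

This only works when the argument is the object itself. Once an array has decayed to a pointer (for example, as a function parameter), sizeof no longer reports the array's size.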

Roddy 2009-07-15 22:03:20

++ Ohmygosh! U use a macro!? Better go into hiding!

Mike Dunlavey 2009-07-16 00:44:48

I think you made the macro parameter (x), but used (s) in the actual body... might want to edit that.

micmoo 2009-07-16 01:19:23

@micmoo - thanks - fixed. @mike re Macros: yes... however they're unavoidable in C. The C++ answer to this question would be *very* different!

Roddy 2009-07-16 08:41:37

@Roddy: BTW that was sarcasm :) Regarding the current macros-are-bad religion, I'm a devout skeptic.

Mike Dunlavey 2009-07-17 18:26:21

Your macro would make it hard to spot the problem with code such as this: `char* foo = malloc(80); CLEAR(foo);`

Dan Breslau 2009-07-30 21:44:34
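
The case Dan Breslau describes can be sketched as follows (the function name and the 'x' fill pattern are assumptions made for the demonstration). With a pointer argument, CLEAR zeroes the bytes of the pointer variable itself, not the 80-byte buffer it points to:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CLEAR(s) memset(&(s), 0, sizeof(s))

/* Returns 1 if CLEAR(foo) left the malloc'ed buffer untouched,
   0 if the buffer was modified, -1 if allocation failed. */
int clear_on_pointer_misses_buffer(void)
{
    char *foo = malloc(80);
    if (foo == NULL) {
        return -1;
    }
    memset(foo, 'x', 80);      /* recognizable contents */

    char *saved = foo;         /* keep a copy so the buffer can be checked and freed */
    assert(sizeof(foo) == sizeof(char *)); /* size of the pointer, not 80 */
    CLEAR(foo);                /* overwrites the pointer variable only */

    int untouched = (saved[0] == 'x' && saved[79] == 'x');
    free(saved);
    return untouched;
}
```

Without the saved copy, the only reference to the allocation would be overwritten and the buffer would leak. For heap buffers, an explicit `memset(foo, 0, 80)` is the call that actually clears the data.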
