Under what circumstances should I expect memcpy to outperform assignment on modern Intel/AMD hardware? I am using GCC 4.2.x on a 32-bit Intel platform (but am interested in 64-bit as well).

+11  A: 

You should never expect them to outperform assignments. Rather, assignment will outperform memcpy. The reason is that the compiler will use memcpy anyway when it would be faster (if you use optimization flags). If not, and if the structure is reasonably small so that it fits into registers, direct register manipulation can be used, which wouldn't require any memory access at all.

GCC has special block-move patterns internally that figure out when to directly change registers / memory cells, or when to use the memcpy function. Note that when assigning the struct, the compiler knows at compile time how big the move is going to be, so it can, for instance, unroll small copies (do a move n times in a row instead of looping). Note -mno-memcpy:

-mmemcpy
-mno-memcpy
    Force (do not force) the use of "memcpy()" for non-trivial block moves.  
    The default is -mno-memcpy, which allows GCC to inline most constant-sized copies.

Who knows better when to use memcpy than the compiler itself?

Johannes Schaub - litb
Note that the reverse can apply - in GCC at least, memcpy of a small constant size is replaced with copy instructions, and if used with a pointer to a small source and/or destination does *not* prevent one or both being optimised into registers. So: do whatever results in the simplest code.
Steve Jessop
You shouldn't expect one to outperform the other. If you have a performance problem, you should profile it, see if assignment/memcpy is the problem, and if so, try changing them to use the other, and see if that performs better. More profiling, less guesswork. ;)
jalf
That is to say, I would expect "assignments will outperform memcpy" also to be false, given that the questioner has specified a recent GCC. But assuming no cast is required, I agree with your advice to use assignment, since it results in the clearest code.
Steve Jessop
@jalf: I totally agree. Since the question was "which is faster?", not "should I care which is faster?", I think "the compiler will deal with it whichever you do" is a fair answer, even though in the big picture the true answer is probably "why are you even asking?" ;-)
Steve Jessop
Note that this is a non-question anyway, since assignment *is* copying memory after all. And if you don't provide block-move patterns, GCC will fall back to memcpy. But onebyone, true, that would probably be the big-picture answer :p Anyway, what makes you say recent GCCs make assignment slower than memcpy?
Johannes Schaub - litb