I would like to copy a relatively short block of memory (less than 1 KB, typically 2-200 bytes) in a time-critical function. The best code for this on the CPU side seems to be rep movsd. However, I cannot get my compiler to generate this code. I hoped (and I vaguely remember seeing this) that using memcpy would achieve it through the compiler's built-in intrinsics, but based on disassembly and debugging, the compiler appears to call the memcpy/memmove library implementation instead. I also hoped the compiler might be smart enough to recognize the following loop and use rep movsd on its own, but it does not seem to.
char *dst;
const char *src;
// ...
for (int r=size; --r>=0; ) *dst++ = *src++;
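A minimal, self-contained sketch of the two variants described above (the memcpy call and the hand-written loop), which may be handy for inspecting the generated assembly. The function names copy_with_memcpy and copy_with_loop, and the assumption that size is an int byte count, are illustrative only and not taken from the real code.

```c
#include <string.h>

/* Variant 1: plain memcpy. Hypothetical wrapper name; whether the compiler
   inlines this or calls the library routine depends on compiler and settings. */
void copy_with_memcpy(char *dst, const char *src, int size)
{
    memcpy(dst, src, (size_t)size);
}

/* Variant 2: the byte-by-byte loop from the question, wrapped in a function.
   Assumes size >= 0 and that the two buffers do not overlap. */
void copy_with_loop(char *dst, const char *src, int size)
{
    for (int r = size; --r >= 0; )
        *dst++ = *src++;
}
```

Both functions should leave the same bytes in dst; only the generated code is expected to differ.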
Is there some way to make the Visual Studio compiler generate a rep movsd sequence other than by using inline assembly?