Hello! Is there a way to compare two blocks of memory, and know at which point they differ (memcmp() does not meet this requirement)? I wouldn't want to perform costly loops. Thanks in advance.
Regards, Neo_b
Hello! Is there a way to compare two blocks of memory, and know at which point they differ (memcmp() does not meet this requirement)? I wouldn't want to perform costly loops. Thanks in advance.
Regards, Neo_b
Compared to whatever else you are doing, a loop is cheap: the big cost will be retrieving the data from ram (or disk!) in the first place.
You can't avoid looping with memory comparison of more than a few bytes. Write the algorithm as you can imagine it. It's simple enough and you might be amazed how well the compiler optimizes code like this.
std::mismatch will do that for you in conjunction std::distance.
memcmp simply does a "costly loop", byte for byte. For example, here is Microsoft's implementation:
EXTERN_C int __cdecl memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
INT v = 0;
BYTE *p1 = (BYTE *)Ptr1;
BYTE *p2 = (BYTE *)Ptr2;
while(Count-- > 0 && v == 0) {
v = *(p1++) - *(p2++);
}
return v;
}
Most other implementations do the exact same thing. For your needs, you could do something like this:
long my_memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
INT v = 0;
long pos = -1;
BYTE *p1 = (BYTE *)Ptr1;
BYTE *p2 = (BYTE *)Ptr2;
while(Count-- > 0 && v == 0)
{
v = *(p1++) - *(p2++);
pos++;
}
return pos;
}
You will always need a loop. But you could benchmark if looping by 4 bytes (cast to int*) or by 8 bytes (uint64_t
or long long int
) is faster than the naive per-byte solution.
Even better, depending on the length (say, >1kb) you could unroll the loop, meaning you check e.g. per 8 int/uint64_t and on a mismatch pinpoint the first differing byte.
uint64_t *bigsteps1 = (uint64_t*)m1;
uint64_t *bigsteps2 = (uint64_t*)m2;
int steps = min(m1_len,m2_len)/sizeof(uint64_t);
int i;
for ( i=0; i<steps; i+=8 )
{
if ( bigsteps1[i] != bigsteps2[i]
|| bigsteps1[i+1] != bigsteps2[i+1]
/// ....
|| bigsteps1[i+7] != bigsteps2[i+7] ) break;
}
// i<steps tells if we found a difference
// end game is trivial, left as an excercise to the reader.
The loop unroll may also backfire, for you have all these +N things in there and the i+=8 as well. Benchmark to be sure.
ps also check memory alignment: this will be fastest when m1&0xff == m2&0xff == 0
If there was a better way of comparing two blocks of memory, memcmp would be reimplemented to do that.
Having said that often, memcmp has a default portable implementation in the standard C library but there are is often implemented by the compiler itself as a builtin function. This builtin function should be highly optimized for the target architecture.So take the library implementation with a pinch of salt.