views:

96

answers:

6

Hello! Is there a way to compare two blocks of memory, and know at which point they differ (memcmp() does not meet this requirement)? I wouldn't want to perform costly loops. Thanks in advance.

Regards, Neo_b

+2  A: 

Compared to whatever else you are doing, a loop is cheap: the big cost will be retrieving the data from ram (or disk!) in the first place.

Autopulated
+2  A: 

You can't avoid looping with memory comparison of more than a few bytes. Write the algorithm as you can imagine it. It's simple enough and you might be amazed how well the compiler optimizes code like this.

tenfour
+2  A: 

std::mismatch will do that for you in conjunction std::distance.

Eugen Constantin Dinca
You assumed he's using STL iterator, and moreover he need to know at which point memory differs.
Doomsday
I had std::equal first which was obviously wrong so I've corrected it.Algorithms work very nicely with pointers as well as with (full blown) iterators.
Eugen Constantin Dinca
@Doomsday: `char*` *is* an iterator type, and `mismatch` does return two iterators pointing to the difference. +1
Potatoswatter
@Doomsday: To better clarify what other commenters have said, all of the STL algorithms work with plain pointers, not just iterators.
Billy ONeal
Err, my initial comment was on std::equal, not std::mismatch. I'm fine with the mismatch solution.
Doomsday
A: 

memcmp simply does a "costly loop", byte for byte. For example, here is Microsoft's implementation:

EXTERN_C int __cdecl memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
    INT v = 0;
    BYTE *p1 = (BYTE *)Ptr1;
    BYTE *p2 = (BYTE *)Ptr2;

    while(Count-- > 0 && v == 0) {
        v = *(p1++) - *(p2++);
    }

    return v;
}

Most other implementations do the exact same thing. For your needs, you could do something like this:

long my_memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
    INT v = 0;
    long pos = -1;
    BYTE *p1 = (BYTE *)Ptr1;
    BYTE *p2 = (BYTE *)Ptr2;

    while(Count-- > 0 && v == 0) 
    {
        v = *(p1++) - *(p2++);
        pos++;
    }

    return pos;
}
Andrew
byte-per-byte is indeed costly. 32-bit `int` operations may even be faster than their 8-bit counterparts.
mvds
I created my own implementation (I thought I could replace it with something else eventually). My needs require up to 10 000 000 iterations. The system freezes sometimes, but it works. It also says how many bytes do not match after a first non-match occurence.
Neo_b
@Neo_b: 10 million iterations is not that much -- most any system will do that in a quarter second or less. I'd be looking at your input buffering scheme, or considering rethinking how you are attacking this problem. If you are searching for strings, for example, the Boyer Moore algorithm will probably do you better than anything here.
Billy ONeal
I'm actually checking if two images (or their parts) are identical (for my remote desktop server). No need to send what hasn't changed, right? :)
Neo_b
Wouldn't you want an algorithm that returned all changed bytes then, not just the first location of the changed bytes?
Andrew
A: 

You will always need a loop. But you could benchmark if looping by 4 bytes (cast to int*) or by 8 bytes (uint64_t or long long int) is faster than the naive per-byte solution.

Even better, depending on the length (say, >1kb) you could unroll the loop, meaning you check e.g. per 8 int/uint64_t and on a mismatch pinpoint the first differing byte.

uint64_t *bigsteps1 = (uint64_t*)m1;
uint64_t *bigsteps2 = (uint64_t*)m2;
int steps = min(m1_len,m2_len)/sizeof(uint64_t);
int i;
for ( i=0; i<steps; i+=8 )
{
    if ( bigsteps1[i]   != bigsteps2[i]
      || bigsteps1[i+1] != bigsteps2[i+1]
   /// ....
      || bigsteps1[i+7] != bigsteps2[i+7] ) break;
}

// i<steps tells if we found a difference
// end game is trivial, left as an excercise to the reader.

The loop unroll may also backfire, for you have all these +N things in there and the i+=8 as well. Benchmark to be sure.

ps also check memory alignment: this will be fastest when m1&0xff == m2&0xff == 0

mvds
Neo_b
This will be faster in some cases, but it could result in some issues. First of all, it relies on your platform having the same alignment for 64 bit integers as it does for characters, which is often not the case. (Nobody says the base of the character array has to be on an 8 byte boundary) Second, a builtin intrinsic or assembler will probably be faster. On x86, the memory alignment issue will just make things slower, and on other architectures, it will cause the processor to throw an interrupt.
Billy ONeal
mvds
Oh, right, my mistake (the 0xff without zeros before somehow illogically tricked me into thinking that m1 was a byte).
Neo_b
+1  A: 

If there was a better way of comparing two blocks of memory, memcmp would be reimplemented to do that.

Having said that often, memcmp has a default portable implementation in the standard C library but there are is often implemented by the compiler itself as a builtin function. This builtin function should be highly optimized for the target architecture.So take the library implementation with a pinch of salt.

doron