Every byte must be visited and checked, so the opportunities for optimisation seem limited. I can think of two possibilities:
Do you know anything about where the differences are likely to occur? For example, is there a reason to suppose that differences are more likely at one end of the buffer than the other? You could examine some sample input data statistically and see whether there's any benefit in starting the comparison at one end or the other.
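To make that concrete, here is a rough sketch in C. The function names (first_difference, buffers_equal_from_end) are my own inventions for illustration, and I'm assuming the task is an equality check on two buffers of the same length. The first function reports where a pair of buffers first differs, so you could run it over sample data and tally the positions. The second shows what scanning from the far end would look like if the tally suggests differences cluster there.

```c
#include <stddef.h>

/* Hypothetical helper: index of the first differing byte,
   or len if the buffers are identical. Useful for collecting
   statistics on where differences tend to appear. */
size_t first_difference(const unsigned char *a, const unsigned char *b, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (a[i] != b[i])
            return i;
    }
    return len;
}

/* Hypothetical helper: returns 1 if the buffers are equal, 0 otherwise.
   Scans from the last byte towards the first, so it exits early
   when differences are concentrated near the end. */
int buffers_equal_from_end(const unsigned char *a, const unsigned char *b, size_t len)
{
    while (len > 0) {
        len--;
        if (a[len] != b[len])
            return 0;
    }
    return 1;
}
```

If the buffers are usually identical, neither direction helps, since every byte still has to be checked.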
Another possibility: can you work in ints or longs? In C you could play pointer tricks to treat 4 adjacent bytes as an int, then do int comparisons rather than byte comparisons. It's not obvious that this must be quicker than four times as many byte comparisons, but it just possibly might be.
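A minimal sketch of the word-at-a-time idea in C follows. The function name buffers_equal_wordwise is hypothetical, and I'm again assuming an equality check on two buffers of the same length. Note that I've used memcpy into a local uint32_t rather than a straight pointer cast: the cast can break alignment and strict-aliasing rules, whereas compilers typically reduce the small fixed-size memcpy to a single load. Whether it actually beats the byte loop is something you'd have to measure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the buffers are equal, 0 otherwise.
   Compares 4 bytes per iteration, then finishes byte by byte. */
int buffers_equal_wordwise(const unsigned char *a, const unsigned char *b, size_t len)
{
    size_t i = 0;

    /* Main loop: load 4 adjacent bytes from each buffer into a
       32-bit integer and compare the integers. */
    for (; i + sizeof(uint32_t) <= len; i += sizeof(uint32_t)) {
        uint32_t wa, wb;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        if (wa != wb)
            return 0;
    }

    /* Tail: compare the remaining 0 to 3 bytes individually. */
    for (; i < len; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}
```

It's also worth timing this against the library's memcmp, which is often already tuned along these lines.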
This is one of the few occasions where even a touch of hand-assembling might yield some benefits.