I'm not sure how you do the XOR bit (at least it should be slow; I don't think any of current GPUs accelerate that), but here's my idea:
- have two input images
- turn on occlusion query.
- draw the two images to the screen (i.e. full screen quad with two textures set up), with a fragment shader that computes abs(texel1-texel2), and kills the pixel (discard in GLSL) if the pixels are the same (difference is zero or below some threshold). Easiest is probably just using a GLSL fragment shader, and there you just read two textures, compute abs() of the difference and discard the pixel. Very basic GLSL knowledge is enough here.
- get number of pixels that passed the query. For pixels that are the same, the query won't pass (pixels will be discarded by the shader), and for pixels that are different, the query will pass.
At first I though of a more complex approach that involves depth buffer, but then realized that just killing pixels should be enough. Here's my original though (but the above one is simpler and more efficient):
- have two input images
- clear screen and depth buffer
- draw the two images to the screen (i.e. full screen quad with two textures set up), with a fragment shader that computes abs(texel1-texel2), and kills the pixel (discard in GLSL) if the pixels are different. Draw the quad so that it's depth buffer value is something close to near plane.
- after this step, depth buffer will contain small depth values for pixels that are the same, and large (far plane) depth values for pixels that are different.
- turn on occlusion query, and draw another full screen quad with depth closer than far plane, but larger than the previous quad.
- get number of pixels that passed the query. For pixels that are the same, the query won't pass (depth buffer is already closer), and for pixels that are different, the query will pass. You'd use SAMPLES_PASSED_ARB to get this. There's an occlusion query example at CodeSampler.com to get your started.
Of course all this requires GPU with occlusion query support. Most GPUs since 2002 or so do support that, with exception of some low-end ones (in particular, Intel 915 (aka GMA 900) and Intel 945 (aka GMA 950)).