I would like to compare a screenshot of one application (could be a Web page) with a previously taken screenshot to determine whether the application is displaying itself correctly. I don't want an exact match comparison, because the aspect could be slightly different (in the case of a Web app, depending on the browser, some element could be at a slightly different location). It should give a measure of how similar are the screenshots.

Is there a library / tool that already does that? How would you implement it?


Well, not to answer your question directly, but I have seen this happen. Microsoft recently launched a tool called PhotoSynth which does something very similar to determine overlapping areas in a large number of pictures (which could be of different aspect ratios).

I wonder if they have any available libraries or code snippets on their blog.

+1  A: 

You'll need pattern recognition for that. To determine small differences between two images, Hopfield nets work fairly well and are quite easy to implement. I don't know any available implementations, though.

Konrad Rudolph

I wonder (and I'm really just throwing the idea out there to be shot down) if something could be derived by subtracting one image from the other, and then compressing the resulting image as a jpeg of gif, and taking the file size as a measure of similarity.

If you had two identical images, you'd get a white box, which would compress really well. The more the images differed, the more complex it would be to represent, and hence the less compressible.

Probably not an ideal test, and probably much slower than necessary, but it might work as a quick and dirty implementation.

Matt Sheppard
+2  A: 

You might look at the code for the open source tool findimagedupes, though it appears to have been written in perl, so I can't say how easy it will be to parse...

Reading the findimagedupes page that I liked, I see that there is a C++ implementation of the same algorithm. Presumably this will be easier to understand.

And it appears you can also use PixiePlus or gqview.


to expand on Vaibhav's note, hugin is an open-source 'autostitcher' which should have some insight on the problem.


Well a really base-level method to use could go through every pixel colour and compare it with the corresponding pixel colour on the second image - but that's a probably a very very slow solution.

+13  A: 

This depends entirely on how smart you want the algorithm to be.

For instance, here are some issues:

  • cropped images vs. an uncropped image
  • images with a text added vs. another without
  • mirrored images

The easiest and simplest algorithm I've seen for this is just to do the following steps to each image:

  1. scale to something small, like 64x64 or 32x32, disregard aspect ratio, use a combining scaling algorithm instead of nearest pixel
  2. scale the color ranges so that the darkest is black and lightest is white
  3. rotate and flip the image so that the lighest color is top left, and then top-right is next darker, bottom-left is next darker (as far as possible of course)

Edit A combining scaling algorithm is one that when scaling 10 pixels down to one will do it using a function that takes the color of all those 10 pixels and combines them into one. Can be done with algorithms like averaging, mean-value, or more complex ones like bicubic splines.

Then calculate the mean distance pixel-by-pixel between the two images.

To look up a possible match in a database, store the pixel colors as individual columns in the database, index a bunch of them (but not all, unless you use a very small image), and do a query that uses a range for each pixel value, ie. every image where the pixel in the small image is between -5 and +5 of the image you want to look up.

This is easy to implement, and fairly fast to run, but of course won't handle most advanced differences. For that you need much more advanced algorithms.

Lasse V. Karlsen
What is a "combining scaling algorithm"?
Gregg Lind
+5  A: 

The 'classic' way of measuring this is to break the image up into some canonical number of sections (say a 10x10 grid) and then computing a histogram of RGB values inside of each cell and compare corresponding histograms. This type of algorithm is preferred because of both its simplicity and it's invariance to scaling and (small!) translation.

+3  A: 

Use a normalised colour histogram. (Read the section on applications here), they are commonly used in image retrieval/matching systems and are a standard way of matching images that is very reliable, relatively fast and very easy to implement.

Essentially a colour histogram will capture the colour distribution of the image. This can then be compared with another image to see if the colour distributions match.

This type of matching is pretty resiliant to scaling (once the histogram is normalised), and rotation/shifting/movement etc.

Avoid pixel-by-pixel comparisons as if the image is rotated/shifted slightly it may lead to a large difference being reported.

Histograms would be straightforward to generate yourself (assuming you can get access to pixel values), but if you don't feel like it, the OpenCV library is a great resource for doing this kind of stuff. Here is a powerpoint presentation that shows you how to create a histogram using OpenCV.

+1  A: 

Don't video encoding algorithms like MPEG compute the difference between each frame of a video so they can just encode the delta? You might look into how video encoding algorithms compute those frame differences.

Look at this open source image search application It describes several image similarity algorighms, three of which are from the MPEG-7 standard: ScalableColor, ColorLayout, EdgeHistogram and Auto Color Correlogram.

+1  A: 

If this is something you will be doing on an occasional basis and doesn't need automating, you can do it in an image editor that supports layers, such as Photoshop or Paint Shop Pro (probably GIMP or Paint.Net too, but I'm not sure about those). Open both screen shots, and put one as a layer on top of the other. Change the layer blending mode to Difference, and everything that's the same between the two will become black. You can move the top layer around to minimize any alignment differences.

Mark Ransom