views: 1083

answers: 7

Do you guys know of any algorithms that can be used to compute the difference between two images?

Take this webpage for example: http://tineye.com/ You give it a link or upload an image and it finds similar images. I doubt that it compares the image in question against all of them (or maybe it does).

By compute I mean something like what the Levenshtein distance or the Hamming distance is for strings.

By no means do I need the correct answer for a project or anything; I just found the website and got very curious. I know Digg pays for a similar service for their website.

+1  A: 

What TinEye does is a sort of hashing over the image or parts of it (see their FAQ). It's probably not a real hash function, since they want similar "hashes" for similar (or nearly identical) images. But all they need to do is compare that hash, and probably substrings of it, to know whether the images are similar/identical or whether one is contained in another.
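
The FAQ doesn't give specifics, but a minimal sketch of one common flavour of perceptual hashing (an "average hash": shrink, grayscale, threshold each pixel against the mean, then compare hashes by Hamming distance) might look like this in Python, assuming Pillow and NumPy are available; the helper names and the 8x8 size are illustrative choices, not TinEye's actual algorithm:

    import numpy as np
    from PIL import Image

    def average_hash(path, hash_size=8):
        """Illustrative "average hash": shrink to hash_size x hash_size
        grayscale and set one bit per pixel, depending on whether that
        pixel is brighter than the mean."""
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = np.asarray(img, dtype=np.float64)
        return pixels > pixels.mean()   # boolean hash_size x hash_size "hash"

    def hamming_distance(hash_a, hash_b):
        """Number of differing bits; 0 means the thumbnails match exactly."""
        return int(np.count_nonzero(hash_a != hash_b))

    # Small distances (a handful of bits out of 64) usually indicate
    # similar or near-identical images.
    # print(hamming_distance(average_hash("a.jpg"), average_hash("b.jpg")))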

Joey
+1  A: 

Here's an image similarity page, but it's for polygons. You could convert your image into a finite number of polygons based on color and shape, and run this algorithm on each of them.

John Ellinwood
A: 

Correlation techniques will make a match jump out. If they're JPEGs, you could compare the dominant coefficients for each 8x8 block and get a decent match. This isn't exactly correlation, but it's based on a cosine transform, so it's a first cousin.
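
No code is given here, but a rough sketch of that idea (2D DCT of each 8x8 block, keeping only the low-frequency corner of each block as a signature) could look like the following, assuming NumPy and SciPy; the block size, the number of coefficients kept, and the helper names are arbitrary illustrative choices:

    import numpy as np
    from scipy.fft import dctn

    def dominant_coefficients(gray, block=8, keep=4):
        """For each block x block tile of a grayscale array, keep the
        top-left keep x keep (low-frequency) DCT coefficients."""
        h, w = gray.shape
        h, w = h - h % block, w - w % block    # drop ragged edges
        sig = []
        for y in range(0, h, block):
            for x in range(0, w, block):
                tile = gray[y:y + block, x:x + block].astype(np.float64)
                sig.append(dctn(tile, norm="ortho")[:keep, :keep].ravel())
        return np.concatenate(sig)

    def block_dct_distance(gray_a, gray_b):
        """Smaller is more similar; both images must have the same shape."""
        return float(np.linalg.norm(dominant_coefficients(gray_a)
                                    - dominant_coefficients(gray_b)))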

John at CashCommons
+2  A: 

One technique is to use color histograms. You can then apply machine learning algorithms to find similar images based on the representation you use, for example the commonly used k-means algorithm. I have seen other solutions that analyze the vertical and horizontal lines in the image after running edge detection. Texture analysis is also used.
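
As an illustration of the histogram idea (not the answerer's actual code), a minimal sketch with Pillow and NumPy: build a normalized per-channel color histogram and compare two images by histogram intersection, where 1.0 means identical color distributions; the bin count and helper names are arbitrary:

    import numpy as np
    from PIL import Image

    def color_histogram(path, bins=16):
        """Concatenate normalized R, G and B histograms into one vector."""
        rgb = np.asarray(Image.open(path).convert("RGB"))
        hist = [np.histogram(rgb[..., c], bins=bins, range=(0, 256))[0]
                for c in range(3)]
        hist = np.concatenate(hist).astype(np.float64)
        return hist / hist.sum()

    def histogram_intersection(hist_a, hist_b):
        """1.0 = identical color distributions, 0.0 = completely disjoint."""
        return float(np.minimum(hist_a, hist_b).sum())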

A recent paper clustered images from Picasa Web. You can also try the clustering algorithm that I am working on.

cdv
+3  A: 

Consider using lossy wavelet compression and comparing the highest relevance elements of the images.
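
The answer is only a hint, but one way to sketch it, assuming PyWavelets is installed and reading "highest relevance" as the largest-magnitude wavelet coefficients (the wavelet, decomposition level, and coefficient count below are arbitrary illustrative choices):

    import numpy as np
    import pywt

    def wavelet_signature(gray, keep=200, wavelet="haar", level=3):
        """Decompose a grayscale array and keep the positions and signs
        of the keep largest-magnitude coefficients as a compact signature."""
        coeffs = pywt.wavedec2(gray.astype(np.float64), wavelet, level=level)
        flat = pywt.coeffs_to_array(coeffs)[0].ravel()
        top = np.argsort(np.abs(flat))[-keep:]
        return set(zip(top.tolist(), np.sign(flat[top]).tolist()))

    def wavelet_similarity(gray_a, gray_b, keep=200):
        """Fraction of shared (position, sign) pairs; the images must be
        the same size for the positions to line up. 1.0 = very similar."""
        return len(wavelet_signature(gray_a, keep) &
                   wavelet_signature(gray_b, keep)) / float(keep)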

Sparr
I've done some work using just this technique and gotten good results (although not good enough to fund a full development project).
Larry OBrien
+8  A: 

The very simplest measures are going to be RMS-error based approaches, for example root-mean-square error (RMSE) itself or the closely related peak signal-to-noise ratio (PSNR).

These probably gel with your notion of distance measures, but their results are really only meaningful if the two images are already very close, for example when you're checking how well a particular compression scheme preserved the original image. Also, the same result from either comparison can mean a lot of different things, depending on what kind of artifacts are present (take a look at the paper I cite below for some example photos of how RMS/PSNR can be misleading).
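
For concreteness, a minimal sketch of both measures over two equally-sized 8-bit grayscale arrays (NumPy assumed; the function names are just illustrative):

    import numpy as np

    def rmse(gray_a, gray_b):
        """Root-mean-square error; 0 means the images are identical."""
        diff = gray_a.astype(np.float64) - gray_b.astype(np.float64)
        return float(np.sqrt(np.mean(diff ** 2)))

    def psnr(gray_a, gray_b, max_value=255.0):
        """Peak signal-to-noise ratio in dB; higher means more similar."""
        err = rmse(gray_a, gray_b)
        return float("inf") if err == 0 else float(20.0 * np.log10(max_value / err))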

Beyond these, there's a whole field of research devoted to image similarity. I'm no expert, but here are a few pointers:

  • A lot of work has gone into approaches using dimensionality reduction (PCA, SVD, eigenvalue analysis, etc) to pick out the principal components of the image and compare them across different images.

  • Other approaches (particularly in medical imaging) use segmentation techniques to pick out important parts of images, then compare the images based on what's found.

  • Still others have tried to devise similarity measures that get around some of the flaws of RMS error and PSNR. There was a pretty cool paper on the spatial-domain structural similarity (SSIM) measure, which tries to mimic people's perception of image error instead of direct, mathematical notions of error (a small SSIM example follows this list). The same guys did an improved translation/rotation-invariant version using wavelet analysis in this paper on WSSIM.

  • It looks like TinEye uses feature vectors with values for lots of attributes to do their comparison. If you hunt around on their site, you eventually get to the Ideé Labs page, and their FAQ has some (but not too many) specifics on the algorithm:

    Q: How does visual search work?

    A: Idée’s visual search technology uses sophisticated algorithms to analyze hundreds of image attributes such as colour, shape, texture, luminosity, complexity, objects, and regions. These attributes form a compact digital signature that describes the appearance of each image, and these signatures are calculated by and indexed by our software. When performing a visual search, these signatures are quickly compared by our search engine to return visually similar results.
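
As a small, concrete pointer for the SSIM item above (this uses scikit-image's implementation, which I'm assuming is available; it is not TinEye's algorithm, and the function name is just illustrative):

    from skimage.metrics import structural_similarity

    def ssim_score(gray_a, gray_b):
        """SSIM in [-1, 1]; 1.0 means structurally identical. Both inputs
        are 8-bit grayscale arrays of the same shape."""
        return float(structural_similarity(gray_a, gray_b, data_range=255))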

This is by no means exhaustive (it's just a handful of techniques I've encountered in the course of my own research), but if you google for technical papers or look through proceedings of recent conferences on image processing, you're bound to find more methods for this stuff. It's not a solved problem, but hopefully these pointers will give you an idea of what's involved.

tgamblin
+1  A: 

Here is some code I wrote 4 years ago in Java (yikes) that does image comparisons using histograms. Don't look at any part of it other than buildHistograms().

https://jpicsort.dev.java.net/source/browse/jpicsort/ImageComparator.java?rev=1.7&view=markup

Maybe it's helpful, at least if you are using Java.

mkoryak