Hi guys,

In one of our community sites, we allow users to upload images. These images are approved or rejected by our moderators.

To limit the work needed by our administrators, we want to 'log' each rejected picture in some kind of database, and do a lookup in this database before submitting an image for approval. If a similar image has already been rejected, the uploaded image won't be submitted for approval.

We can of course just log things like the filename, size and MD5 of the picture to check for similarity, but it would definitely be better if we could find differently cropped or resized images.

TinEye.com provides a similar functionality.

Do you know any kind of open-source software capable of this? Do you have any other ideas?

Thanks!

A: 

To detect resized and lossily compressed images you could resize both images to some standard size (like 40x40px), subtract the known image from the new image pixel by pixel, and compare the resulting distance to a threshold.
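As a rough illustration of that idea, here is a minimal sketch in plain Python. It assumes the images are already decoded into grayscale grids (a list of rows with values 0-255). The function names, the mean-absolute-difference metric and the threshold of 10 are my own assumptions for the example, not something prescribed above.

```python
def downscale(pixels, size=40):
    """Shrink a grayscale image (list of rows, values 0-255) to size x size
    by averaging the source pixels that fall into each target cell."""
    height, width = len(pixels), len(pixels[0])
    out = []
    for ty in range(size):
        y0 = ty * height // size
        y1 = max((ty + 1) * height // size, y0 + 1)  # at least one source row
        row = []
        for tx in range(size):
            x0 = tx * width // size
            x1 = max((tx + 1) * width // size, x0 + 1)  # at least one source column
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def mean_abs_difference(a, b):
    """Average per-pixel absolute difference between two equally sized grids."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def looks_like(new_pixels, known_pixels, threshold=10.0, size=40):
    """True if the two images are within `threshold` after normalising their size.
    The threshold is an arbitrary starting point and needs tuning on real data."""
    distance = mean_abs_difference(downscale(new_pixels, size),
                                   downscale(known_pixels, size))
    return distance <= threshold
```

In practice you would decode and convert the uploads with an imaging library first, and pick the threshold by testing against images your moderators have actually rejected.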

Unfortunately this doesn't work with rotation or cropping. In that case you'd need to extract scale-invariant features of the image.

Another problem of this approach is that with a naive implementation the computational cost is linear in the size of the list of known images, so it might get too expensive quickly to compare the new image against all old images.
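One possible way to soften the linear cost is to group the known images by a coarse signature and only run the full comparison against the matching group. The sketch below uses quantised mean brightness as that signature. This is my own assumption for illustration (the class name `BrightnessIndex` and the bucket width are hypothetical), and it shares the weakness noted above: heavy cropping can shift the mean enough to land in a different bucket.

```python
from collections import defaultdict


class BrightnessIndex:
    """Hypothetical index: groups rejected images by coarse mean brightness."""

    def __init__(self, bucket_width=16):
        self.bucket_width = bucket_width
        self.buckets = defaultdict(list)

    def _bucket(self, pixels):
        # pixels: grayscale grid (list of rows, values 0-255)
        values = [v for row in pixels for v in row]
        return int(sum(values) / len(values)) // self.bucket_width

    def add(self, image_id, pixels):
        """Register a rejected image under its brightness bucket."""
        self.buckets[self._bucket(pixels)].append(image_id)

    def candidates(self, pixels):
        """Return ids from the same bucket and its two neighbours;
        only these need the expensive pixel-by-pixel comparison."""
        b = self._bucket(pixels)
        found = []
        for neighbour in (b - 1, b, b + 1):
            found.extend(self.buckets.get(neighbour, []))
        return found
```

This only narrows the candidate list; each candidate still has to go through the actual distance check.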

CodeInChaos
A: 

You may build a list of "similar images" even if they are not guaranteed to be 100% similar. The similarity could be calculated from the image fingerprint (as Winner said, you may scale it to a standard size and build a checksum from that). The "average" color and the color "variation" might also be used.

Based on this you may display a list of "similar images" (clickable thumbs) to the admin, sorted in the order of "most likely to be similar".
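A minimal sketch of such a ranking, assuming each image is available as a flat list of (r, g, b) tuples. Using the per-channel mean as the "average" color and the per-channel standard deviation as the "variation" is my interpretation, and the function names are made up for the example.

```python
from statistics import mean, pstdev


def color_stats(pixels):
    """pixels: flat list of (r, g, b) tuples.
    Returns the per-channel means followed by the per-channel standard deviations."""
    channels = list(zip(*pixels))
    return tuple(mean(c) for c in channels) + tuple(pstdev(c) for c in channels)


def rank_similar(new_pixels, rejected):
    """rejected: dict mapping image id -> pixel list.
    Returns the ids ordered from most to least likely to be similar."""
    target = color_stats(new_pixels)

    def distance(image_id):
        stats = color_stats(rejected[image_id])
        # Euclidean distance between the two six-number summaries
        return sum((a - b) ** 2 for a, b in zip(target, stats)) ** 0.5

    return sorted(rejected, key=distance)
```

For a real site you would compute and store the statistics once per rejected image instead of recomputing them on every upload.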

You may also look at Image::Compare http://linux.softpedia.com/get/Programming/Widgets/Perl-Modules/Image-Compare-43727.shtml and jpegDiff http://www.marengo-ltd.com/open_source/index.php

Quamis