One of my friends came up with an interesting problem. Assume that we have a set of images in the system. Now, someone might submit a new image by slightly modifying any of the images already submitted, and in that case the system should report that the submitted image is a forged image.

I can think about two solutions.

Solution 1 - Do a bitmap-based comparison of each input image against the images in the database, probably after converting them to grayscale (to counter color-changing tricks) and resizing them to a standard size.

Solution 2 - Create a self-organizing map (SOM) and train it with all the existing images. If someone submits an image and it has a close match, report it as forged.
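
To make Solution 1 concrete, here is a minimal pure-Python sketch. It assumes images are already decoded into rows of (r, g, b) tuples; the function names, the 8x8 working size and the threshold of 10 are all hypothetical choices for illustration, not tuned values.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance ints."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]

def resize_nearest(gray, size):
    """Nearest-neighbour resize of a grayscale image to size x size."""
    h, w = len(gray), len(gray[0])
    return [[gray[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference of two same-sized grayscale images."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def looks_forged(candidate_rgb, stored_rgb, size=8, threshold=10.0):
    """True if the candidate is suspiciously close to the stored image.

    size and threshold are illustrative assumptions; real values would
    have to be chosen from known forged/genuine pairs.
    """
    a = resize_nearest(to_gray(candidate_rgb), size)
    b = resize_nearest(to_gray(stored_rgb), size)
    return mean_abs_diff(a, b) < threshold
```

In practice `looks_forged` would be called once per stored image, so the cost grows linearly with the size of the database.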

It might not be possible to have a system with more than 90% accuracy. Please share your thoughts/suggestions/solutions.

Edit after going through a few answers: I already have a backprop neural network and an XML-based language to train neural networks here - http://www.codeproject.com/KB/dotnet/neuralnetwork.aspx

I'm looking for specific answers to the problem I described above.

Thanks

+2  A: 

Good question, but it depends on how much code you want to write. What if I mirror/flip an image, or cut and paste within images? When you solve this problem, you've cracked most CAPTCHAs too?

If you have a lot of horsepower and programming man-hours, you might want to look at Fourier transforms and histograms to find matches. This would identify flip/mirror and copy/paste.
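
The histogram half of this can be sketched in a few lines of pure Python. A histogram ignores where pixels sit, so a mirrored or flipped copy gets exactly the same histogram as the original. This assumes grayscale images given as rows of 0-255 ints; the function names and the 16-bin choice are hypothetical, and the Fourier part is not shown.

```python
def histogram(gray, bins=16):
    """Normalised intensity histogram of a grayscale image."""
    counts = [0] * bins
    n = 0
    for row in gray:
        for p in row:
            counts[p * bins // 256] += 1  # map 0-255 onto 0..bins-1
            n += 1
    return [c / n for c in counts]

def histogram_intersection(h1, h2):
    """Overlap of two normalised histograms: 1.0 identical, 0.0 disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def mirror(gray):
    """Horizontally flipped copy of the image."""
    return [row[::-1] for row in gray]
```

A score close to 1.0 only says the intensity distributions agree, so two unrelated images can also score high; it works better as a cheap first filter than as a final verdict.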

Then create lots of small tests, like unit tests(?), for things like "can this bit of image be found in the source", "can this bit be found when hue-rotated", etc.

Very open-ended problem.

Dead account
Not really. We are talking about identifying minor manipulations. Like, modifying only the faces of people in photos.
amazedsaint
+2  A: 

Guess you can start with Image Recognition with Neural Networks.

Basically I think it covers your Solution 2 approach. At least you'll find useful guidance for Neural Networks and how to train them.

Jorge Córdoba
Not actually. That article talks about a feed-forward back-propagation artificial neural network, while the OP is talking about a self-organizing map, a very different concept.
Bruno Reis
Well, I already have a backprop neural network and an xml based language to create and train backprop neural networks - http://www.codeproject.com/KB/dotnet/neuralnetwork.aspx - As Bruno mentioned, I'm talking about a SOM network. And AForge in codeproject is a good implementation of SOM. Looking for more detailed answers
amazedsaint
+2  A: 

There is certainly a trade-off between performance and accuracy here. You could use neural networks, but you may need some pre-transformations first, e.g. http://en.wikipedia.org/wiki/Image%5Fregistration. There are several more efficient algorithms, like histogram comparison. Check the segmentation article on Wikipedia: en.wikipedia.org/wiki/Segmentation_%28image_processing%29

bja
+1  A: 

I think the simplest solution would be to invisibly digitally watermark images that are already in the system, and new images as they are added.

As new images are added, simply check for traces of the digital watermark.

Winston Smith
Oops, I just clarified the question a bit more: "Now, someone might submit a new image by slightly modifying any of the images already submitted".
amazedsaint
A: 

No offense, but this might be an "if you only know a hammer, every problem looks like a nail" type of situation. Artificial neural networks aren't a good solution for everything. If you simply calculated a pixel-by-pixel mean squared difference between the stored images and the "forge candidate", you could probably judge image similarity more reliably.

I'd also suggest resizing all images to e.g. 50x50 pixels and performing a histogram equalization before comparing them. That way you could ignore image resizing and global brightness/contrast changes.
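
A rough pure-Python sketch of the equalize-then-compare step, assuming both images are already grayscale (rows of 0-255 ints) and already resized to the same dimensions. The names `equalize` and `mse` are made up for this example.

```python
def equalize(gray):
    """Histogram-equalise a grayscale image (rows of 0-255 ints)."""
    flat = [p for row in gray for p in row]
    counts = [0] * 256
    for p in flat:
        counts[p] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image, nothing to stretch
        return [row[:] for row in gray]
    # Lookup table spreading the used intensities over 0-255.
    lut = [round(max(0, c - cdf_min) / (n - cdf_min) * 255) for c in cdf]
    return [[lut[p] for p in row] for row in gray]

def mse(a, b):
    """Pixel-by-pixel mean squared difference of two same-sized images."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

Because equalization depends only on the rank order of intensities, a uniform brightness shift that does not clip leaves the equalized image unchanged, so the difference after equalization reflects actual content changes. The cut-off between "same image" and "different image" would still have to be picked empirically.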

nikie
Sure, as I mentioned, we can't go for a 100% solution. The objective of the question is to come up with an approach that is most suitable
amazedsaint
A: 

After some research, I've decided that the best way is to use the self-organizing map (SOM) approach.

The idea is to train the SOM network initially with the available/valid images. Then, when a new image is inserted, find the nearest stored images and, if any of them is within a distance threshold, report the new image as forged.
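
To show the shape of this idea, here is a tiny self-contained SOM in pure Python. It is not the AForge API: `TinySOM`, `index_images` and `find_source` are hypothetical names, the decay schedules are arbitrary, and each image is assumed to be a short feature vector already (for example, the pixels of a downscaled grayscale copy scaled to 0-1).

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class TinySOM:
    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        self.weights = {(r, c): [rng.random() for _ in range(dim)]
                        for r in range(rows) for c in range(cols)}

    def bmu(self, v):
        """Grid position of the best-matching unit for vector v."""
        return min(self.weights, key=lambda pos: dist2(self.weights[pos], v))

    def train(self, samples, epochs=50, lr0=0.5):
        sigma0 = max(self.rows, self.cols) / 2
        for epoch in range(epochs):
            frac = epoch / epochs
            lr = lr0 * (1 - frac)                   # learning rate decays
            sigma = max(sigma0 * (1 - frac), 0.5)   # neighbourhood shrinks
            for v in samples:
                br, bc = self.bmu(v)
                for (r, c), w in self.weights.items():
                    g = math.exp(-((r - br) ** 2 + (c - bc) ** 2)
                                 / (2 * sigma ** 2))
                    for i in range(len(w)):
                        w[i] += lr * g * (v[i] - w[i])

def index_images(som, stored):
    """Map each SOM node to the names of the stored images landing on it."""
    buckets = {}
    for name, vec in stored.items():
        buckets.setdefault(som.bmu(vec), []).append(name)
    return buckets

def find_source(som, stored, buckets, candidate, threshold):
    """Closest stored image in the candidate's node if within threshold, else None."""
    names = buckets.get(som.bmu(candidate), [])
    best = min(names, key=lambda n: dist2(stored[n], candidate), default=None)
    if best is not None and math.sqrt(dist2(stored[best], candidate)) < threshold:
        return best
    return None
```

Here the SOM only narrows the search to one node and the final decision is still a plain distance check against the stored vectors. A slightly modified image can land on a neighbouring node and be missed, so a fuller version would probably also look at adjacent nodes.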

AForge is an excellent library with SOM support (http://code.google.com/p/aforge/)

Information on basic SOM here

A good read on SOM here

amazedsaint