I have made a videochat, but as usual, a lot of men like to, ehm, abuse the service (I leave it up to you to figure out the nature of such abuse), which is not something I endorse in any way, nor do most of my users. No, I have not stolen chatroulette.com :-) Frankly, I am half-embarrassed to bring this up here, but my question is technical and rather specific:

I want to filter/deny users based on their video content when this content is of an offending character, such as a user flashing his junk on camera. What kind of image comparison algorithm would suit my needs?

I have spent a week or so reading scientific papers and have become aware of multiple theories and their implementations, such as SIFT, SURF and some of the wavelet-based approaches. Each of these has drawbacks and advantages, of course. But since the nature of my image comparison is highly specific (deny service if a certain body part is encountered on video, in a range of positions), I am wondering which of the methods will suit me best.

Currently, I lean towards something along the lines of the following (wavelet-based, plus what I assume to be some proprietary innovations): http://grail.cs.washington.edu/projects/query/

With the above, I can simply draw the offending body part and expect offending content to be considered a match based on a threshold. Then again, I am unsure whether the method is invariant to transformations and, if it is, to which kinds; the paper isn't really specific on that.

Alternatively, I am thinking that a SURF implementation could do, but I am afraid that it could give me false positives. Can such an implementation be trained to recognize, or give weight to, a specific feature?

I am aware that there exist numerous questions on SURF and SIFT here, but most of them are generic in that they usually explain how to "compare" two images. My comparison is feature specific, not generic. I need a method that does not just compare two similar images, but one which can give me a rank/index/weight for a feature (however the method lets me describe it, be it an image itself or something else) being present in an image.

+1  A: 

It looks like what you need is not feature detection but object recognition, e.g. the Viola-Jones method. Take a look at the facedetect.cpp example shipped with OpenCV (there are also several ready-to-use Haar cascades: face detector, body detector...). It also uses image features, namely Haar-like features (based on Haar wavelets). You might also be interested in using color information; take a look at the CamShift algorithm (also available in OpenCV).
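
To make the Haar feature idea concrete, here is a minimal pure-Python sketch, not OpenCV's actual implementation. It assumes a grayscale image given as a list of rows, and the function names (`integral_image`, `rect_sum`, `two_rect_feature`) are made up for illustration. A Haar-like feature is just the difference between pixel sums of adjacent rectangles, and a summed-area table makes each rectangle sum cost four lookups.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Everything above this row, plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half: a large magnitude suggests a vertical edge."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4x4 "image": bright left half, dark right half.
image = [[9, 9, 1, 1] for _ in range(4)]
flat = [[5, 5, 5, 5] for _ in range(4)]

edge_response = two_rect_feature(integral_image(image), 0, 0, 4, 4)
flat_response = two_rect_feature(integral_image(flat), 0, 0, 4, 4)
print(edge_response, flat_response)
```

A trained cascade combines many such features, at many positions and scales, into a sequence of increasingly strict tests, so most image windows are rejected after only a few cheap checks.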

Cfr
Well, the problem is that not all users show their faces. So I need something that is not trained for, adapted to, or specific to face recognition. Will facedetect.cpp be of any help then?
amn
This is not face recognition but an object detection framework. It can be trained to detect anything that contains a set of features (cars, eyes, bodies...). Here is a tutorial: http://lab.cntl.kyutech.ac.jp/~kobalab/nishida/opencv/OpenCV_ObjectDetection_HowTo.pdf
Cfr
Yeah, sorry, I wrote "object recognition" in the original answer. Viola-Jones is an object detection framework, originally used to detect faces (not recognize them). Facedetect.cpp shows how to use it with nested cascades; you can use it with any other cascades, not only faces.
Cfr
Thank you, I will look into it.
amn
+1  A: 

This is more of a computer vision problem. You have to recognize objects in your image or video sequence, and for that you can use a lot of different algorithms (many of them work in the spectral domain, which is why you will have to apply a transformation first).

In order to be accurate, you will also need a knowledge base or, at least, some descriptors that will define the object.

Try OpenCV, it has some algorithms already implemented (and basic descriptors included).

There are applications/algorithms out there that you can "train" (like neural networks) and that are then able to identify objects based on that training. Most of them (at least, the good ones) are not very popular and can only be found in research groups specialized in computer vision, object recognition, AI, etc.

Good luck!

J--