How to recognize (in)appropriate images?

To make photo and image moderation easier for an application targeting Google App Engine (gae), I am trying to get started with basic Python image recognition, i.e. basic semantic information about what an image looks like. The goal is to hold back doubtful material until a human can judge it, and to approve automatically most of the images that are fine. In a test batch of more than 10,000 images only one or a very few were unsuitable, so avoiding false positives is important.

Very basically, moderation will display a number of images with just an "ok" button or, if the default is "ok", a "Disapprove" button, depending on the default decision. The default will probably be to publish everything and have a human disapprove ad hoc, since more than 99 % of the material is suitable. I found the following links to follow, and thank you all in advance for any advice, suggestions and recommendations.

link text

link text

+2  A: 

In python you could always:

import supreme_court

Because when it comes to pornography, they know it when they see it.

Mediocre jokes aside, I would develop a bunch of fuzzy image recognizers that match easy things (for example, how much of the image is made up of skin-color tones). You could probably come up with a good number of suspicious variables at this point; this is the hard(ish) part. Then use Classification and Regression Trees (CART) to implement the actual decision engine. Train it on your training sample, then cross-validate to get a sense of the false positive and false negative rates.
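To make the idea concrete, here is a minimal sketch of one such fuzzy recognizer plus a one-split decision rule. Everything in it is an assumption for illustration: the RGB skin-tone thresholds are a commonly quoted rule of thumb and will misfire on many images, the names is_skin, skin_fraction and fit_stump are made up, and the "stump" is a much-simplified stand-in for a real CART tree (a single feature, a single threshold, chosen by counting training errors).

```python
def is_skin(r, g, b):
    """Crude RGB skin-tone test (assumed rule of thumb, not a tuned model)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_fraction(pixels):
    """Fraction of (r, g, b) pixels that pass the skin-tone test."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(*p) for p in pixels) / len(pixels)


def fit_stump(values, labels):
    """Choose the threshold on one feature with the fewest training errors.

    An image is flagged when its feature value is >= the threshold.
    labels are truthy for images a human marked as unsuitable.
    """
    best_threshold, best_errors = None, None
    for t in sorted(set(values)):
        errors = sum((v >= t) != bool(y) for v, y in zip(values, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Toy data: four images, already reduced to their skin fraction.
features = [0.1, 0.2, 0.7, 0.9]
unsuitable = [0, 0, 1, 1]
threshold = fit_stump(features, unsuitable)
```

With more than one variable you would feed all of them to a proper tree learner instead, and hold out part of the labeled sample to estimate the false positive rate rather than trusting the training errors.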

Vince
It's perfectly correct that nudity should get censored for my purpose. However, recognizing escort-service ads is a very similar problem, and those photos look exactly like regular ones, so text should complement the image patterns. Court aside, I will definitely follow the informative link, and could use a gae histogram heuristic, i.e. images.Image(d).histogram()
LarsOn
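A rough sketch of what such a histogram heuristic could look like, assuming histogram() returns three lists of 256 pixel counts (red, green, blue), which is how I understand the App Engine Images API. The helper names channel_means and looks_warm and the margin value are hypothetical. Note that per-channel histograms lose the link between channels within a pixel, so this can only say that an image is warm-toned overall, not that it contains skin.

```python
def channel_means(histogram):
    """Mean intensity per channel.

    histogram is assumed to be [red, green, blue], each a list of
    256 pixel counts indexed by intensity level.
    """
    means = []
    for counts in histogram:
        total = sum(counts)
        weighted = sum(level * n for level, n in enumerate(counts))
        means.append(weighted / total if total else 0.0)
    return tuple(means)


def looks_warm(histogram, margin=20.0):
    """True when red clearly dominates green and green dominates blue.

    margin is an arbitrary illustrative value, not a tuned threshold.
    """
    r, g, b = channel_means(histogram)
    return r > g + margin and g > b


def flat_histogram(r_level, g_level, b_level, n_pixels=4):
    """Build a test histogram where every pixel has the same colour."""
    hist = [[0] * 256 for _ in range(3)]
    for channel, level in zip(hist, (r_level, g_level, b_level)):
        channel[level] = n_pixels
    return hist
```

On App Engine the input would come from images.Image(d).histogram(); flat_histogram is only there so the functions can be tried without the SDK.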
+2  A: 

I believe you will want to start here

http://en.wikipedia.org/wiki/Feature_detection_%28computer_vision%29

and then brush up on your statistical theory, reading any papers on the topic.

Unknown
Thanks, we all love statistics
LarsOn