views: 453
answers: 9

We have set up a product management system where the head of product development can upload pictures of products, and those pictures instantly show up on our website.

Last week, when one person was leaving the job, he uploaded a bunch of XXX-rated pictures, and they showed up immediately on the website. Luckily, we noticed and removed them within a matter of minutes. Still, that was enough time to shock some of our customers.

Here is my question: how can I effectively analyse and block such images from being uploaded? Can this be done using any library in PHP? Is it even possible with PHP in the first place?

Any help would be much appreciated.

Edit 1:

After some digging, I found this:

http://pikture.logikit.net/

Has anyone used it before, or any opinion about it?

+4  A: 

It's a difficult problem in any language; the image-processing part is extremely hard.

There is some research on the subject.

Mark E
+1  A: 

It's possible, although you will get false negatives and false positives.

Much simpler would be to implement some form of moderation system which requires approval from another person before anything goes live.

mopoke
+7  A: 

Implement image premoderation. If there's a single person uploading those (as you imply), it shouldn't be too hard for someone else to take a brief look and click "Approve" on each one.

And it's way cheaper, and more reliable, than using any library. You cannot formalize "adult".
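
For illustration, here's a minimal sketch of what that could look like in PHP with PDO. The table and column names (product_images, status, approved_by) are made up for the example, not anything from your system:

    <?php
    // Minimal pre-moderation sketch: uploads land as "pending" and only
    // approved rows ever reach the public site. Schema names are invented.
    $pdo = new PDO('mysql:host=localhost;dbname=catalog', 'user', 'pass');

    // On upload: insert as pending instead of publishing immediately.
    function saveUpload(PDO $pdo, $productId, $path) {
        $stmt = $pdo->prepare(
            'INSERT INTO product_images (product_id, path, status, uploaded_at)
             VALUES (?, ?, "pending", NOW())'
        );
        $stmt->execute(array($productId, $path));
    }

    // Moderator action: a second person flips the status after a quick look.
    function approveImage(PDO $pdo, $imageId, $moderatorId) {
        $stmt = $pdo->prepare(
            'UPDATE product_images
                SET status = "approved", approved_by = ?, approved_at = NOW()
              WHERE id = ? AND status = "pending"'
        );
        $stmt->execute(array($moderatorId, $imageId));
    }

    // Public site: only ever select approved images.
    $images = $pdo->query(
        "SELECT path FROM product_images WHERE status = 'approved'"
    )->fetchAll(PDO::FETCH_COLUMN);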

Seva Alekseyev
Justice Potter Stewart agrees! http://en.wikipedia.org/wiki/I_know_it_when_I_see_it
Bill Karwin
Seconding this approach! It's known as "four eyes" in banking: somebody submits, and a different person needs to approve before it takes effect. It would not stop your junior from getting the approver's password as well, though.
James Anderson
A sensible and practical solution
PCBEEF
I agree that's a sensible solution. I should give it a shot. Thank you very much!
Nirmal
+3  A: 

There is no reliable way of doing this. Whatever automated process you use will generate false positives and false negatives. If you're willing to accept that, you could look at the Image Nudity Filter, which analyzes images based on skin tone. It will tend to flag images that are close-ups of faces too, however.

Such a method might be effective in conjunction with information about a user's history. If they suddenly upload hundreds of images (when they might have averaged two a week prior to that) and most of them get flagged, then you probably want to suspend those uploads until someone looks at them.

Come to think of it, simply uploading a large number of images in a short period of time (and bringing that to the attention of the site admins) is probably sufficient most of the time.
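
To give an idea of the kind of heuristic such filters rely on, here's a very naive skin-tone ratio check using GD. This is not the Image Nudity Filter's actual code, just a sketch of the idea; expect plenty of false positives on faces and wood tones, so use it to flag for review rather than to reject:

    <?php
    // Naive skin-tone ratio check with GD -- an illustration only.
    function skinToneRatio($file) {
        $img = imagecreatefromjpeg($file);   // assumes a JPEG upload
        $w = imagesx($img);
        $h = imagesy($img);
        $skin = 0;
        $total = 0;
        for ($x = 0; $x < $w; $x += 4) {     // sample every 4th pixel
            for ($y = 0; $y < $h; $y += 4) {
                $rgb = imagecolorat($img, $x, $y);
                $r = ($rgb >> 16) & 0xFF;
                $g = ($rgb >> 8) & 0xFF;
                $b = $rgb & 0xFF;
                // crude RGB skin heuristic
                if ($r > 95 && $g > 40 && $b > 20 &&
                    $r > $g && $r > $b && abs($r - $g) > 15) {
                    $skin++;
                }
                $total++;
            }
        }
        imagedestroy($img);
        return $total ? $skin / $total : 0;
    }

    // Flag for human review rather than rejecting outright.
    if (skinToneRatio($_FILES['photo']['tmp_name']) > 0.4) {
        // mark the upload as "needs review" instead of publishing it
    }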

cletus
In all seriousness, wouldn't it be wise to have a filter look for a sharp jump from brown to peach color (genitals and nipples and whatnot)?
Doug
Noted down your points. Thanks for the valuable suggestion.
Nirmal
A: 

I don't think you can guarantee that any software will 100% accurately identify images like that.

You need to have a human moderating the uploads. Don't make them instantly visible on your website.

pavium
+1  A: 

You're about to go down the "skintone detection" path of preventing porn. This is a mistake since any solution, even if possible, will be hideously expensive to get right.

And wouldn't work on porn images of the character from "Avatar", for example :-)

There are two approaches you could take (I prefer the first):

  1. Don't allow immediate publishing of those images until they've been cleared by a trusted party.
  2. Allow your users to mark them as offensive and check out the ones that get votes (a rough sketch of this follows below).
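
For the second approach, a rough sketch; the flags column, status values, and threshold are made up for illustration:

    <?php
    // Rough sketch of option 2: visitors flag an image and, past a threshold,
    // it gets hidden until someone reviews it. Schema names are invented.
    function flagImage(PDO $pdo, $imageId, $threshold = 3) {
        $pdo->prepare('UPDATE product_images SET flags = flags + 1 WHERE id = ?')
            ->execute(array($imageId));
        $pdo->prepare('UPDATE product_images SET status = "hidden"
                        WHERE id = ? AND flags >= ?')
            ->execute(array($imageId, $threshold));
    }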

The reason I prefer the first is that the second still allows the images to be seen by the general public, and this will almost certainly damage the company's rep.

You should take that rep seriously. I am reminded of a system we set up many years ago to measure water skiing jump lengths.

There were three stations on either side of the river where the people tasked to measure would point their sights at where they thought the skier landed (hexangulation, I guess).

These six locations were basically averaged (after removing outliers) to get the actual position (hence length) and this was all calculated by the computer and sent back to a base station.

However, the TV stations which were transmitting the event had a person read the length off the screen and then manually type them into their own computers. When told that we could automate the process, they made it clear that they would not allow unchecked data to be sent to their viewers (in case we inserted profanity, I guess).

While I thought this a little paranoid, I could understand it.

paxdiablo
+1  A: 

It is really easy to add moderation, as has been suggested, but I imagine that there is some hesitation in terms of limiting productivity and frustrating developers who have to wait on a second set of eyes. Even if all participants understand and don't take it personally, it can still be annoying to:

a) be the person who has to wait a day or two for content to go live because their moderator is out sick and the back-up moderator is in a meeting.

or

b) be the person who gets interrupted every hour or so from a project to click "approve".

Sometimes even technology calls for some of the suggested bureaucracy and has to accept the headaches that come with it, and I think there are definitely smart ways of keeping annoyances like the ones I mentioned to a minimum.

Having said all of that, bear in mind that this manager's actions probably reflect his termination, or at the very least how he took it. Perhaps another non-technical, administrative solution would be to remove system access privileges as soon as you know someone is in a compromised position.

Whether the person turned in their two-week notice or had just gotten the sack that morning, they should either have moderated-access or no access at all to anything that goes public or is mission critical.

Most developers work really hard to gain the trust of superiors allowing them to make things public or write data directly to critical systems. But that trust should come with a very immediate exit-policy clause.


Regarding LogiPik:

I've never used it, but I just tried their demo. I uploaded the following (found by randomly searching for "close up" on Google):

http://www.ethereality.info/ethereality_website/paintings_drawings/new/portraits/elena_formal_portrait/closeup-3.jpg

And it got back the following:

Evaluated as: porn Completed in 25 sec.

For me, that is too slow (for one picture) to get a false positive, especially for $50 US.

Anthony
+1 Hah! Thanks for the comment on the software. I was so worried about the situation that I tried uploading actual porn pictures to test its functionality. But I have some hope for it, because we only deal with furniture products, and I guess it wouldn't block maple-coloured furniture for possible skin tone!
Nirmal
"wait a day or two ... because ... the back-up moderator is in a meeting" - you must have some pretty extreme meetings at your workplace :-)
paxdiablo
A: 

The Trynt API has a very good / fast nudity-detection web service, but it seems to return a 502 header ATM.

Alix Axel
A: 

Using current technology there is no completely reliable solution to this, so don't get sucked in by marketing that implies there is.

See this previous question on StackOverflow.

You could combine a skin detection algorithm with pre-moderation to bring suspected problem images to the attention of the human moderator though. Just don't think you can skip the step of getting a person to look at the images.

By the way, your company's security processes sound a bit weak. How did another employee upload images if only the head of product development should be able to do this? Make sure you store your passwords hashed and minimize access to production servers.
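
For the password part, a minimal sketch using PHP's built-in hashing functions; the variable names are just illustrative:

    <?php
    // Store only a hash, never the plain-text password.
    $hash = password_hash($plainPassword, PASSWORD_DEFAULT);   // at signup / password change

    // At login, compare the entered password against the stored hash.
    if (password_verify($enteredPassword, $storedHash)) {
        // credentials OK -- proceed with login
    }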

Ash