views: 621 | answers: 9

I'm writing a forum application but one of the things I'm concerned about is trolls - users who disrupt dialog with abusive speech or off-topic content. This goes beyond spam prevention because it includes people who are actually participating in a discussion but who either refuse or are unable to follow proper standards of behavior.

Without a way to moderate these users algorithmically (manual moderation would require too much time and attention), I don't see how the forum can succeed: their participation is not only disruptive, it also discourages the kinds of users I am interested in from participating.

Are there specific features I can add to my app that would minimize the disruptive effect of trolls while also minimizing the barriers to entry for new users?

The only feature I can think of that would not require as much active user moderation is identity verification - such as a cell phone number - which I would then actually have to verify belongs to this person. But this creates a significant barrier to entry.

Having a "flag this" link next to user content seems itself to be prone to abuse and to require moderation of its own.

+21  A: 

You can't determine trolling algorithmically without Star Trek-like universal language parsers.

But you can do the next best thing, which is what Stack Overflow does: give various moderation rights to every user. Each user does a little bit of moderation so the primary moderators don't have to put in as much effort.

After very little time, users on SO are able to flag any post as offensive for moderators to see. After more time, they can edit posts and tags, close questions, and even become mini-moderators themselves. On SO the process of granting these rights is procedural, but the moderation itself is not. That's a good approach.

In lower-tech applications like simple forums or IRC channels, simply spreading the load is usually good enough. Instead of having a few moderators with all of the rights, you give regular, trustworthy users (of which there are probably many) a subset of rights to help with moderation. It works very well, since most active community members will do this kind of thing for free, just to see a better community.
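
To make the procedural part concrete, here is a minimal sketch in Python of reputation-gated privileges. The threshold numbers and privilege names are my own illustration, not Stack Overflow's actual values:

    # Illustrative thresholds; tune them to the size of your community.
    PRIVILEGE_THRESHOLDS = {
        "flag_post": 15,
        "edit_tags": 500,
        "edit_posts": 2000,
        "close_questions": 3000,
    }

    def privileges_for(reputation: int) -> set:
        """Return the moderation privileges a user has earned so far."""
        return {name for name, needed in PRIVILEGE_THRESHOLDS.items()
                if reputation >= needed}

    print(privileges_for(750))  # {'flag_post', 'edit_tags'}

The point is that no single privilege is powerful on its own; many users each holding small rights is what spreads the load.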

Welbog
The Gentle (The Penetrating), Wind; Cultivating, Influencing. Had to look it up.
Kieveli
A community will moderate itself as much as it can; giving it tools will help that along.
peacedog
@Kieveli: You're the first to notice.
Welbog
@oscar: If it's only 50 users, why is moderation an issue in the first place? I'm saying on low-tech sites you hand-pick the people who get the extra moderation rights. On a site as large as SO it can be procedural because, on average, people with a lot of upvotes are generally invested in the community in some way. Smaller sites are either too small to require that much moderation, or they're big enough that some of the members ought to be decent people (statistically this is probably the case).
Welbog
@Welbog - I disagree that Stack Overflow has solved this problem for smaller sites. The SO system works well because there are 3 million users here. That's a lot of people to choose moderators from. What if your site only has 50 users? Among those fifty you may not have even one reliable moderator candidate. What do you do then? Assuming that anyone past a certain vote threshold is automatically qualified to moderate, under all circumstances, is not a risk I can accept.
oscar
Out-of-order comments brought to you by lack of editable comments!
Welbog
@Welbog - sorry, tried to correct a typo.
oscar
@Welbog - Moderation is an issue because 100 small sites can take as much time to moderate as a single large one if it's just you doing all the moderation. And I disagree that a site with only 50 users will not require moderation. I've seen the most ridiculous name-calling break out in even very small discussions - the equivalent of a bar fight. The SO approach is great, and it works well here, but it's not going to work on these smaller sites.
oscar
@oscar: I'm shocked that you have found a community of 50 users that are all scoundrels and you feel you need to protect them from each other in some way.
belgariontheking
@oscar: If you have 100 small sites, odds are they are for different subjects, topics or what have you. Each site should have a stakeholder other than yourself who either wanted the site made in the first place or who is invested in the site. If your sites are attracting only trolls, then there is something wrong with the way you are attracting people. Again, there is no procedure that can help you there.
Welbog
@belgariontheking - I am selfish and lazy. Selfish, because I'm not protecting them, I'm protecting me. I don't want mud on my sites making me look bad but also discouraging sensitive people from joining (costing me money). Lazy, because I don't want to have to do a lot of work moderating all of these sites. I just want to drink my coconuts in the Bahamas and let my ad money roll in. And, trust me, it's not hard to find 50 users that are all scoundrels. Anyone who attends soccer matches regularly would believe me.
oscar
@Weblog - "If your sites are attracting only trolls, then there is something wrong with the way you are attracting people." Maybe it's not a problem with the types of people I am attracting but a problem with the way the site is structured. I believe the environment creates the behavior. By structuring the environment correctly, you can avoid the behavior. That's why I asked the question here.
oscar
@oscar: I don't think you can create a structure that will curb that kind of behaviour without intervention from moderators. Anything procedural you can think up, people will eventually figure out what makes it tick and then circumvent it. The only thing that can truly anticipate a human's behaviour is another human.
Welbog
Regular, trustworthy users will very soon form a clique and your resource will become their preserve, not interesting or welcoming to anyone but themselves. Internets crawl with examples, esp. in the blogosphere.
Anton Tykhyy
@Anton Tykhyy: So swap them out on a monthly basis. Don't keep the same set of moderators at all times. Probably a good idea anyway since active members don't always make the best moderators.
Welbog
I don't think all this protect-the-user stuff is a good idea at all. Isn't this called paternalism? 2ch.net is the future. Trolls only thrive because a lot of people are mentally immature and therefore (almost by definition) easily trolled once you find the right pushbuttons.
Anton Tykhyy
If 2ch.net is the future, I'm rooting for Skynet instead.
mquander
+2  A: 

The Something Awful forums charge a $10 fee that can only be paid by credit card. If you screw up, you can be put on probation (losing posting access for X amount of time) or banned (and have to pay another $10 to come back).

Of course, your users may not be willing to pay money, but the fee does help to keep away children (unless their parents are willing to hand over credit card access) and trolls that aren't invested enough in what they're doing to pay for it.
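
A hypothetical sketch of that probation/ban mechanic; the class and field names are mine, not Something Awful's:

    from datetime import datetime, timedelta

    class Account:
        def __init__(self):
            self.banned = False
            self.probation_until = None  # datetime, or None if in good standing

        def can_post(self, now: datetime) -> bool:
            if self.banned:
                return False
            return self.probation_until is None or now >= self.probation_until

        def put_on_probation(self, now: datetime, days: int) -> None:
            self.probation_until = now + timedelta(days=days)

        def ban(self) -> None:
            # Coming back means registering, and paying the fee, again.
            self.banned = True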

chocojosh
+2  A: 

Your goal should be to use algorithms and automated processes to reduce the number of posts that your moderators must manually review. By running processes that check for posts containing hate speech, or containing no text related to the original post (both very non-trivial problems), you can reduce the workload for your moderators. It is also possible to promote your best community members to moderators to reduce the per-person workload of moderating the forum.
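
As a sketch of that triage idea: the goal is for moderators to only ever see posts the automated checks could not clear. Both checks below are deliberately crude stand-ins for the very non-trivial classifiers mentioned above:

    BLOCKLIST = {"idiot", "moron"}  # stand-in for a real hate-speech model

    def looks_like_hate_speech(text: str) -> bool:
        return any(term in text.lower().split() for term in BLOCKLIST)

    def looks_off_topic(text: str, topic_words: set) -> bool:
        # Zero vocabulary overlap with the thread is a weak off-topic signal.
        return not (set(text.lower().split()) & topic_words)

    def needs_human_review(text: str, topic_words: set) -> bool:
        return looks_like_hate_speech(text) or looks_off_topic(text, topic_words)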

Matthew Vines
+5  A: 

At the very least, you can keep a list of offensive words and phrases and run a script whenever a user with fewer than X posts, or one registered for fewer than Y days, makes a post. If one or more of the offensive words or phrases is found, the script could send you a "possible troll" email.
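
A minimal sketch of that heuristic; OFFENSIVE_PHRASES, the X/Y thresholds, and the send_email stub are all placeholders you would wire up yourself:

    import re

    OFFENSIVE_PHRASES = ["some phrase", "another phrase"]  # your list here
    MIN_POSTS, MIN_ACCOUNT_DAYS = 10, 30                   # the X and Y above

    def send_email(to, subject, body):
        """Stub: wire this up to smtplib or your mail provider."""
        print("ALERT to %s: %s" % (to, subject))

    def maybe_alert(user_posts: int, account_age_days: int, text: str) -> bool:
        if user_posts >= MIN_POSTS and account_age_days >= MIN_ACCOUNT_DAYS:
            return False  # established user: skip the screening entirely
        pattern = "|".join(re.escape(p) for p in OFFENSIVE_PHRASES)
        if re.search(pattern, text, re.IGNORECASE):
            send_email("admin@example.com", "Possible troll", text)
            return True
        return False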

Sam Pearson
Bear in mind that actually filtering on that basis is very error-prone, and it is easy to get scoffed at (probably not what you want).
David Thornley
I like this. Careful choice of your naughty words will determine your success here. On a political site, do you ban the word "bush", or only ban it in posts that do not also include the word "monkey"? Same with "java" and "sucks" on a C# site. Whatever you do, don't post the naughty-word list, or you'll just get a bunch of posts talking about v@ginas and you'll never be able to catch up. Also, your users will all leave you to get jobs writing spam emails.
belgariontheking
And bear in mind that, if you institute censorship, people will look for the limits of the system. Of course, if this is just used for flagging possible abuse, like Sam said, there's a lot fewer problems.
David Thornley
Also, consider clbuttic, medireview and the Cupertino effect (see http://thedailywtf.com/Articles/The-Clbuttic-Mistake-.aspx, http://revealingerrors.com/wordlist_profanity, http://revealingerrors.com/medireview, etc.).
ShreevatsaR
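
The failure mode those links describe is easy to reproduce. Matching on word boundaries avoids the mangling, though not the deeper problem that keyword lists lack context:

    import re

    text = "a classic mistake"
    print(text.replace("ass", "butt"))       # 'a clbuttic mistake'
    print(re.sub(r"\bass\b", "butt", text))  # 'a classic mistake', untouched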
I would actually recommend against this; it does not work well, although it is the very first thought many PMs come up with. :-/ The failure is partly due to reasons already mentioned, but fundamentally it is because keyword targeting just does not work as well as contextual detection using Bayesian/Markov etc. techniques.
StaxMan
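
A toy word-level naive Bayes scorer in the spirit of that suggestion: rather than a fixed keyword list, learn word likelihoods from posts your moderators have already labeled. The two training sets below are placeholders; a real system needs far more data and better tokenization:

    import math
    from collections import Counter

    def train(posts):
        counts = Counter(w for p in posts for w in p.lower().split())
        return counts, sum(counts.values())

    def log_likelihood(text, counts, total, vocab):
        # Laplace-smoothed log P(words | class)
        return sum(math.log((counts[w] + 1) / (total + vocab))
                   for w in text.lower().split())

    troll_counts, troll_total = train(["you are an idiot", "shut up moron"])
    ok_counts, ok_total = train(["try a hash map instead", "great question thanks"])
    vocab = len(set(troll_counts) | set(ok_counts))

    def looks_trollish(text):
        return (log_likelihood(text, troll_counts, troll_total, vocab) >
                log_likelihood(text, ok_counts, ok_total, vocab))

    print(looks_trollish("you idiot"))  # True with this toy training data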
+8  A: 

A good way to control such abuse is to eliminate, or reduce as much as possible, anonymity. If you have a scenario where people know their words can be traced back to them, you'd be surprised how much less likely they are to misbehave.

This article on ReadWriteWeb tackles the topic very well

Conrad
This theory is supported by the following illustration: http://www.penny-arcade.com/comic/2004/03/19/
Joel Mueller
I wonder why this is voted down. In my experience, removing anonymity is the single best technological troll repellent there is.
mquander
Trolling is not necessarily a function of anonymity; it is more a function of being a misanthrope. As for tracing people's words back to them, here's a news flash: people don't register on forums with their mailing address, SIN, and two forms of government-issued picture ID. They select a pseudonym, need only an easily-obtainable e-mail address, and it's off to the races. In my experience, I have seen no difference in the amount of trolling between BBSes that allow and disallow anonymous/unregistered posting.
Cirno de Bergerac
An essay by a guy who is in favor of anonymity: http://wakaba.c3.cx/shii/
Cirno de Bergerac
sgm: I agree with that. However, I have seen a difference in the amount of trolling between pseudonymous communities like that and pseudonymous communities with a bigger barrier to entry (e.g. a fee to register an account, gated entry so that you need an invitation to join, real-life identity verification, etc.). The latter sites have fewer trolls, and the trolls are less destructive.
mquander
Let me amend that; the troll prevention comes from a combination of moderation and barrier-to-entry. Trolls that arise are banned, and once you're banned, you can't just sign up a new account in 5 minutes, so the result is not many trolls.
mquander
OK, I realized that what I'm advocating is not actually related to anonymity, although they often come in the same package.
mquander
There you go. Using a comic to prove your theory. Those who upvoted that comment should be ashamed.
belgariontheking
+4  A: 

I would suggest a flagging system for replies, but make it very specific.

For example, registered users can flag a post/reply using various specific types such as:

  • Personal attack, ad hominem
  • Off topic
  • Spam

You could then provide a listing for admins to view the most flagged replies.
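
A minimal sketch of the flag store and that admin listing; combined with Marc's comment below, you could also auto-hide a post once its flag count crosses a limit:

    from collections import Counter

    FLAG_TYPES = {"personal_attack", "off_topic", "spam"}
    flags = Counter()  # keyed by (post_id, flag_type)

    def flag_post(post_id: int, flag_type: str) -> None:
        assert flag_type in FLAG_TYPES, "unknown flag type"
        flags[(post_id, flag_type)] += 1

    def most_flagged(n=20):
        """The listing an admin would review, worst offenders first."""
        return flags.most_common(n)

    flag_post(42, "spam")
    flag_post(42, "spam")
    flag_post(7, "off_topic")
    print(most_flagged())  # [((42, 'spam'), 2), ((7, 'off_topic'), 1)]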

Jon
Sounds familiar ;-p Perhaps make it so that if it goes over some limit, the system deals with it automatically ;-p
Marc Gravell
@Marc: this kind of system works well on SO because there's so little contention here, and because 99% of people come here to help or to get help. This situation is hardly usual in the internets!
Anton Tykhyy
A: 

It's probably just a brainstorming idea... you could collect many troll posts and run a frequency analysis on their use of particular words, emoticons, exclamation marks, etc. You could then find a threshold level for each of these items and develop an algorithm to analyze each post.

I repeat, it's just a half-baked idea!

Another one, to catch randomly typed words for example, is entropy: you can calculate the entropy of the post.
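
For what it's worth, the entropy part is only a few lines, assuming Shannon entropy over characters. Treat unusual values as a weak signal worth a human look, not a verdict:

    import math
    from collections import Counter

    def char_entropy(text: str) -> float:
        """Shannon entropy of the post's characters, in bits per character."""
        if not text:
            return 0.0
        counts = Counter(text)
        n = len(text)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    print(char_entropy("aaaaaaaa"))          # 0.0: no variety at all
    print(char_entropy("asdfjkl;asdfjkl;"))  # 3.0: below typical prose (~4 bits)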

Lopoc
A: 

Flags are a good way to go; do not underestimate the power of user feedback, whether that comes through reputation or through reporting offensive material to be reviewed by a mod.

Algorithms are going to be an awkward way forward; the Star Trek language parser is a little unrealistic. But you could keep a couple of dictionaries and match words against each one separately. For example, you might match a word like "anorexia" and then guess from the surrounding words whether it is being used in a positive context, flagging it as questionable if so (unless you want pro-ana content). You would need dictionaries of words and phrases that suggest a positive sense, words and phrases that suggest a negative sense, words that are sensitive in a positive context, and words that are sensitive in a negative context. You could then compare a post against the thread's average positive/negative use of those terms, so that posts running against the direction of the conversation in a sensitive subject area get flagged as disruptive and queued for review by an admin. Pretty wishy-washy, and only a small step from blind censorship, but it's about all I can think of.
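
Here is a rough sketch of that dictionary idea. Every word list below is a tiny illustration; a real deployment would need curated lists per subject area:

    SENSITIVE = {"anorexia"}
    POSITIVE = {"recovery", "support", "help", "awareness"}
    NEGATIVE = {"tips", "thinspiration", "goal"}

    def flag_sensitive_context(text: str, window: int = 5) -> bool:
        """Flag a post whose sensitive terms sit in negative-leaning context."""
        words = text.lower().split()
        for i, w in enumerate(words):
            if w in SENSITIVE:
                context = words[max(0, i - window):i + window + 1]
                score = sum((x in POSITIVE) - (x in NEGATIVE) for x in context)
                if score < 0:
                    return True  # queue for admin review
        return False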

Toby
+1  A: 

Some people are trying to develop filters for stupidity or trolling. For example, Stupid Filter claims to be "an open-source filter software that can detect rampant stupidity in written English."

Steven