Please help me. Recently I've been getting lots of comments that don't address the posted content in any way, instead only indicating the cardinality of the comment.

I would like to stop these comments before they are posted by using a JavaScript function to check whether they are spam before they are submitted.

Here's what I came up with.

var postHTMLContent = "...";

// Returns true only when the very first comment (index 0) is exactly 'first'.
function isSpamComment(comment, index)
{
    return index === 0 && comment === 'first';
}

It works (no false positives) but lets a lot of other, similarly irrelevant comments through. It even fails if they simply misspell 'first' or get the cardinality of their comment wrong.

Is there a more general function that would stop the stuff that makes it through? Nothing server-side please, and no regexes.

+1  A: 

Spammers will get around your JavaScript attempts to block them.

You want to either delay comments to allow moderation or run them through some sanity tests server-side.

If you can, do as StackOverflow does and allow the community itself to help protect the system. If a user posts something distasteful or stupid, let the others flag it for moderation, or delete it if they are "power"/"trusted" users.

scunliffe
A: 

You can keep adding new "banned comments", but your users will always get around them.

If you want to keep it spam free you have two alternatives, of which only the second one is viable:

  • Program an artificial intelligence that identifies "spam" comments
  • Don't display the comments until a human goes through them and vets them as not spam

This is of course assuming the spam or noise comments come from your human readers. If you are trying to stop automated spam, then it's technically possible; see CAPTCHAs.

Andreas Bonini
A: 

Really, a JavaScript solution isn't a very good answer to this problem. If someone wants to post their 'first' comment they can just disable JavaScript, and if you don't have any server-side validation the spam will still reach the site. The second problem with JavaScript is that users can see which words you consider spammy and simply self-censor them (f!rst, etc.). That being said, there are a few ways you could do some more heuristic spam detection, which I detail below:

Blacklisted terms - sort of like your example. If you do:

if (comment.indexOf(' bad phrase here ') !== -1) { return true; }

you can figure out whether a comment contains a term, not just whether it equals the entire content. Honestly, the no-regex clause really limits what you can do with content detection, but this should at least catch basic phrases within what a user types. It isn't foolproof, though: substring matching can produce false positives for innocent words that merely contain a banned term, like 'Classic' :)
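As a rough sketch of the blacklist idea, the check can loop over a list of phrases and compare case-insensitively. The names bannedPhrases and containsBannedPhrase, and the phrases themselves, are made up for illustration. Note that indexOf returns -1 (never false) when the phrase is absent.

```javascript
// Hypothetical list of banned phrases; these entries are examples only.
var bannedPhrases = ['first', 'f!rst', 'second'];

// Returns true if the comment contains any of the given phrases, ignoring case.
// Plain substring matching, so no regexes are needed.
function containsBannedPhrase(comment, phrases) {
    var normalized = comment.toLowerCase();
    for (var i = 0; i < phrases.length; i++) {
        // indexOf returns -1 when the phrase is not found.
        if (normalized.indexOf(phrases[i].toLowerCase()) !== -1) {
            return true;
        }
    }
    return false;
}
```

Because this is substring matching, containsBannedPhrase('A firsthand account', bannedPhrases) also returns true, which is exactly the false-positive problem described above.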

Comment length - consider that anything shorter than 50 characters or so might not be constructive. That won't stop people from going "first!!!!!!!!!!!!!!!!!!!!!!!!!!!...you get the idea" to get past your filter, though. Definitely trim the white space around the user input so that padding can't be used to pass the length check.
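A minimal sketch of the length check might look like the following. The 50-character threshold comes from the suggestion above; the function names and the extra step of collapsing repeated characters (so that "first!!!!!" is measured as "first!") are my own assumptions, not something the question asked for.

```javascript
// Threshold taken from the "50 characters or so" suggestion; tune as needed.
var MIN_COMMENT_LENGTH = 50;

// Collapses runs of the same character ("first!!!!!" -> "first!") without regexes.
// Only used for measuring length: it also shortens legitimate double letters.
function collapseRepeats(text) {
    var result = '';
    for (var i = 0; i < text.length; i++) {
        if (i === 0 || text.charAt(i) !== text.charAt(i - 1)) {
            result += text.charAt(i);
        }
    }
    return result;
}

// Returns true if the trimmed, collapsed comment is shorter than the threshold.
function isTooShort(comment) {
    var cleaned = collapseRepeats(comment.trim());
    return cleaned.length < MIN_COMMENT_LENGTH;
}
```

This still rejects short but perfectly reasonable replies, so it is a heuristic at best and easy to defeat by padding with varied filler text.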

Those are just a few basic ideas, but honestly there is no point in fixing this on the client side alone. Trolls will be trolls and will always try to find a way around your script if the server isn't willing to back up the client's rules.

That being said, comment approval/moderation is the only surefire way to ensure that only the content you want shows up on your site, and honestly it is not a very good user experience for people trying to take part in a dialogue online. It really should only be used in an environment where you need to protect people from content (for example, a product page marketed to kids).

mcgregok