I am considering allowing users to post to my site without having them register or provide any identifying information. If each post is sent to a db queue and I then manually screen these posts, what sort of issues might I run into? How might I handle those issues? tia

+2  A: 

Tedium seems to be the greatest concern – screening posts manually is effective against spam (I'm assuming this is what you want to weed out) but very boring.

It could be best fixed with a cup of coffee and nice music to listen to while weeding?

Henrik Paul
lol +1 for originality
Seb
+5  A: 

The most obvious issue is that you'll get overwhelmed by the number of submissions to screen, if your site is sufficiently popular.

I would make sure to add some admin tools, so you can automatically kill all posts from a particular IP address, or that match a particular regex. That should help get rid of obvious spam faster, but again, you'd have to be behind the wheel for all of that.
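A rough sketch of what such an admin tool might look like, assuming the queue is available as a list of in-memory records. The `Post` class, its fields, and `reject_matching` are made-up names for illustration; a real site would run the equivalent query against its own db queue.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical queue record; field names are assumptions.
    post_id: int
    ip: str
    body: str
    status: str = "pending"


def reject_matching(queue, ip=None, pattern=None):
    """Reject every pending post that comes from `ip` or whose body
    matches the regex `pattern`. Returns how many posts were rejected."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    rejected = 0
    for post in queue:
        if post.status != "pending":
            continue  # already screened, leave it alone
        from_ip = ip is not None and post.ip == ip
        matches = regex is not None and regex.search(post.body) is not None
        if from_ip or matches:
            post.status = "rejected"
            rejected += 1
    return rejected
```

For example, `reject_matching(queue, ip="10.0.0.1")` clears out one sender in a single pass, and `reject_matching(queue, pattern=r"cheap pills")` does the same for a recurring phrase.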

Daniel Lew
A: 
  • posts that attempt to look legit but aren't
  • the sheer volume

These are the issues that I see on my blog.

cdonner
+5  A: 

Screening every post would be tedious and tiresome, and it would leave the admin wading through spam. My suggestion would be to automate as much of the screening as possible. Besides, requiring identifying information does nothing to prevent spam (a bot will just generate it).

A lot of projects implement a recognition system: the user first has to make one or two posts that get approved; after that he is identified as a trusted poster by IP and (maybe) a cookie, so his posts appear automatically (and can still be marked as spam later).
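As a minimal sketch of that idea, assuming posters are keyed by IP address plus an optional cookie value and that two approvals earn trust. `TrustTracker`, `initial_status`, and the threshold are illustrative names and values, not taken from any particular project, and a real site would persist the counts instead of keeping them in memory.

```python
from collections import defaultdict

APPROVALS_NEEDED = 2  # assumption: "one or two approved posts"


class TrustTracker:
    """Counts manually approved posts per (ip, cookie) pair."""

    def __init__(self, approvals_needed=APPROVALS_NEEDED):
        self.approvals_needed = approvals_needed
        self._approved = defaultdict(int)

    def record_approval(self, ip, cookie_id=None):
        # Call this when a moderator approves a post from this poster.
        self._approved[(ip, cookie_id)] += 1

    def revoke(self, ip, cookie_id=None):
        # Call this when a post from a trusted poster is later marked as spam.
        self._approved.pop((ip, cookie_id), None)

    def is_trusted(self, ip, cookie_id=None):
        return self._approved.get((ip, cookie_id), 0) >= self.approvals_needed


def initial_status(tracker, ip, cookie_id=None):
    """Trusted posters publish immediately; everyone else waits in the queue."""
    return "published" if tracker.is_trusted(ip, cookie_id) else "pending"
```

Keep in mind that IP addresses are shared and cookies can be copied, so trust earned this way should stay easy to revoke.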

Some heuristics on the content of the post (like the number of links in it) could also be used to automatically discard likely spam.
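A link-count check could be as small as the sketch below. The threshold of 3 and the function names are arbitrary choices for illustration, and the regex only looks for tokens starting with `http://`, `https://`, or `www.`, so treat it as a starting point.

```python
import re

# Matches a whole URL-like token, so "http://www.example.com" counts once.
LINK_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
MAX_LINKS = 3  # hypothetical threshold; tune it for your site


def count_links(body):
    """Return the number of URL-like tokens in the post body."""
    return len(LINK_RE.findall(body))


def looks_like_spam(body, max_links=MAX_LINKS):
    """Flag posts that contain more links than the allowed maximum."""
    return count_links(body) > max_links
```

Flagged posts could be dropped outright or just sorted to the bottom of the review queue, depending on how many false positives you can tolerate.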

frgtn
+1  A: 

I've found that asking for the answer to a simple question sent to the browser as an image (like "2 + 3 - 4 =", a variant of a 'captcha' but not so annoying), with a wee bit of JavaScript, does quite well.

Send your form with the image and answer field, and a hidden field with a "challenge" (some randomly generated string). When the user submits the form, hash the challenge and the answer, and send the result back to the server. The server can check for a valid answer before adding it to the database for review.

It seems like a lot of work up front, but it will save hours of review time. Using jQuery:

<script type="text/javascript">
//   Hash function to mask the answer: replaces the typed answer with
//   md5(md5(answer) + challenge) just before the form is submitted.
//   hex_md5() is not part of jQuery; it has to come from a separately loaded MD5 script.
function answerMask()
{
  var a = $('#a').val();   // answer typed by the user
  var c = $('#c').val();   // challenge string from the hidden field
  var h = hex_md5(hex_md5(a) + c);
  $('#a').val(h);
}
</script>
  <form onsubmit="answerMask()" action="/cgi-bin/comment.py" method="POST">
    <table>
      <tr><td>Comment</td><td><input type="text" name="comment" /></td></tr>
      <tr><td># put image here #</td><td><input id="a" type="text" name="a" size="30" /></td></tr>
      <tr><td><input id="c" type="hidden" value="ddd8c315d759a74c75421055a16f6c52" name="c" /></td><td><input type="submit" value=" Go "></td></tr>
    </table>
  </form>
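The server side of this check might look roughly like the sketch below. It assumes the server remembers the expected answer for each challenge it hands out (here in a plain dict; a real site would use its session or database) and that `hex_md5` on the client returns a lowercase hexadecimal MD5 digest. All function names are made up for illustration. MD5 is used only to mirror the client code, not because it adds any security.

```python
import hashlib
import hmac
import secrets

# challenge string -> expected answer; in-memory stand-in for a session store
_pending = {}


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def issue_challenge(expected_answer):
    """Create a random challenge for the hidden field and remember the answer."""
    challenge = secrets.token_hex(16)
    _pending[challenge] = str(expected_answer)
    return challenge


def expected_response(answer, challenge):
    """Same formula as the client: md5(md5(answer) + challenge)."""
    return md5_hex(md5_hex(answer) + challenge)


def verify(challenge, submitted_hash):
    """Check a submitted hash. Each challenge is accepted at most once."""
    answer = _pending.pop(challenge, None)
    if answer is None:
        return False  # unknown or already used challenge
    wanted = expected_response(answer, challenge)
    return hmac.compare_digest(wanted.encode("utf-8"),
                               submitted_hash.strip().lower().encode("utf-8"))
```

Removing the challenge on first use prevents a bot from replaying one solved form over and over.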


Edit update...

I saw this technique on a web site, I'm not sure which one, so this idea isn't mine but you might find it useful.

Provide a form with a challenge field and a comment field. Prefix the challenge with "Pick the third word from: glark snerm hork morf" so the words, and which one to pick, are easy to generate on the server and easy to validate when the form contents come back.
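Generating and checking such a challenge could be sketched like this. The word pool, `ORDINALS`, and both function names are illustrative, and the sketch assumes the expected word is kept on the server (for instance in the session) instead of being sent along with the form.

```python
import random

ORDINALS = ["first", "second", "third", "fourth"]


def make_word_challenge(pool, rng=None):
    """Return (prompt, expected_word). `pool` needs at least four words."""
    rng = rng or random.Random()
    words = rng.sample(pool, len(ORDINALS))   # four distinct words
    index = rng.randrange(len(words))         # which one the user must pick
    prompt = f"Pick the {ORDINALS[index]} word from: {' '.join(words)}"
    return prompt, words[index]


def check_word_answer(expected, submitted):
    """Compare the answer, ignoring case and surrounding whitespace."""
    return submitted.strip().lower() == expected.lower()
```

Calling `make_word_challenge(["glark", "snerm", "hork", "morf", "blim"])` yields a prompt in the same shape as the one quoted above, with the position and word order varying per request.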

The point is to make the user do something, apply a few brain cells, and more work than it's worth for a script kiddie.

Joel
jQuery isn't the fix for everything. Checks like this one should be made server-side for real security. JavaScript may be disabled in any browser, and especially in spam bots, so this wouldn't help much.
Seb
I'm not surprised this was voted down, and I understand that jQuery isn't always the answer, but it worked easily for me because my existing web pieces were already loaded into the browser for other uses. I have another answer that might be better...
Joel