I'm trying to come up with a good enough anti-spam mechanism to prevent automatically generated input. I've read that techniques like CAPTCHAs and "1+1=?" questions work well, but they also present an extra step that gets in the way of quick, free use of the application. (I'm not looking for anything like that, please.)

I've tried adding hidden fields to all of my forms with display: none;. However, I'm almost certain a robot (which is essentially a program) can be configured to detect that form field's id and simply not fill it in.

Do you implement/know of a good method against automatic form-filling robots? Is there something that can be done seamlessly with HTML and/or server-side processing and be (almost) foolproof? (And please, no JS; it can simply be disabled, and there goes my anti-spam mechanism.)

By the way, I'm trying not to rely on sessions for this (e.g., counting how many times a button is clicked to prevent overloads).

A: 

http://recaptcha.net/

reCAPTCHA is a free anti-bot service that helps digitize books.

Anantha Kumaran
As a user, I often find reCAPTCHA hard to figure out. Some of the words are so hard to read that you end up having to try 3 or 4 times. It will definitely help with the robot problem, though.
Brian
What Brian said, and: http://yro.slashdot.org/story/10/03/02/0135238/Scalpers-Earned-25M-Gaming-Online-Ticket-Sellers
SF.
A: 

Many of those spam bots are just server-side scripts that prowl the web. You can combat many of them by using some JavaScript to manipulate the form request before it's sent (i.e., setting an additional field based on some client variable). This isn't a full solution and can lead to other problems (e.g., users without JavaScript, on mobile devices, etc.), but it can be part of your attack plan.

Here is a trivial example...

<script>
// Assumes jQuery is already loaded on the page.
function checkForm()
{
    // When a user submits the form, the secretField's value is changed
    $('input[name="secretField"]').val('goodValueEqualsGoodClient');

    return true;
}
</script>

<form id="cheese" onsubmit="return checkForm();">
<input type="text" name="burger">

<!-- Check that this value isn't the default value in your PHP script -->
<input type="hidden" name="secretField" value="badValueEqualsBadClient">

<input type="submit">
</form>

Somewhere in your PHP script...

<?php

if ($_REQUEST['secretField'] != 'goodValueEqualsGoodClient')
{
    die('you are a bad client, go away pls.');
}

?>

Also, captchas are great, and really the best defense against spam.

John Himmelman
Thanks, though javascript can be easily disabled in any browser, thus annihilating my "anti spam mechanism", so I'm looking for something more global.
sombe
I may be wrong, but wouldn't this tell every JS-disabled user 'you are a bad client, go away pls.'?
sombe
Gal, it's a __trivial__ example, merely demonstrating how to validate against a request var set by client-side JS.
John Himmelman
A: 

Another option, instead of using random letters and numbers like many websites do, is to use random pictures of recognizable objects. Then ask the user to type in either what color something in the picture is, or what the object itself is.

All in all, every solution is going to have its advantages and disadvantages. You are going to have to find a happy medium between the anti-spam mechanism being too hard for users to pass and the number of spam bots that can get through.

Brian
Good idea. I wouldn't use colour as the criterion though, as this may exclude colourblind users.
Neil Aitken
Yes, good point. Actually a problem with images in general is that they are not accessible, and by making them "accessible" with alt tags, robots can easily figure them out.
Brian
A: 

Robots cannot execute JavaScript, so you can do something like injecting a hidden element into the page with JavaScript and then detecting its presence prior to form submission. Beware, though, that some of your users will also have JavaScript disabled.

Otherwise I think you will be forced to use some form of client proof of "humanness".
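
For example, here is a minimal sketch of that idea (the form, the jsToken field name, the handler.php name, and the check are all made up for illustration; jQuery is assumed, as in the example above):

<form id="contact" method="post" action="handler.php">
<input type="text" name="message">
<input type="submit">
</form>

<script>
// Assumes jQuery is loaded. Inject a hidden field that only
// JavaScript-capable clients will ever send with the form.
$(function () {
    $('#contact').append('<input type="hidden" name="jsToken" value="humanish">');
});
</script>

Then in the (hypothetical) handler.php:

<?php

// The field only exists if JavaScript actually ran on the client.
if (!isset($_POST['jsToken']) || $_POST['jsToken'] !== 'humanish')
{
    die('Form rejected.');
}

?>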

Nick Allen - Tungle139
Smart robots _can_ execute JavaScript. With a JavaScript solution you're blocking 99% of robots, though.
Ben Scheirman
A: 

The best solution I've found to avoid getting spammed by bots is using a very trivial question or field on your form.

Try adding a field like one of these:

  • Copy "hello" in the box aside
  • 1+1 = ?
  • Copy the website name in the box

These tricks require the user to understand what must be entered in the form, thus making it much harder to be the target of massive bot form-filling.
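
For instance, a minimal server-side sketch of the "1+1 = ?" variant could look like this (the antiSpamAnswer field name is made up for illustration):

<?php

// Assumes the form contains: <input type="text" name="antiSpamAnswer">
// next to the question "1+1 = ?".
if (!isset($_POST['antiSpamAnswer']) || trim($_POST['antiSpamAnswer']) !== '2')
{
    die('Anti-spam check failed.');
}

?>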

EDIT

The downside of this method, as you stated in your question, is the extra step for the user to validate the form. But, in my opinion, it is far simpler than a CAPTCHA, and the overhead when filling in the form is no more than 5 seconds, which seems acceptable from the user's point of view.

Thibault Falise
+6  A: 

An easy-to-implement but not foolproof (especially against "specific" attacks) way of fighting spam is tracking the time between page load and form submission.

Bots request a page, parse the page, and submit the form. This is fast.

Humans type in a URL, load the page, wait for the page to fully load, scroll down, read the content, decide whether to comment/fill in the form, take time to fill in the form, and submit.

The difference in time can be subtle, and tracking this time without cookies requires some kind of server-side database, which may impact performance.
You also need to tweak the threshold time.
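
As a rough PHP sketch of the idea (purely illustrative: instead of a server-side database it embeds the render timestamp in a signed hidden field, and the field names, secret, and 5-second threshold are all assumptions):

<?php

// When rendering the form: embed the render time plus an HMAC so it can't be forged.
$secret = 'change-me';   // hypothetical shared secret, keep it server-side
$renderedAt = time();
$sig = hash_hmac('sha256', (string) $renderedAt, $secret);

echo '<form method="post" action="comment.php">';
echo '<input type="hidden" name="renderedAt" value="' . $renderedAt . '">';
echo '<input type="hidden" name="sig" value="' . $sig . '">';
echo '<textarea name="comment"></textarea>';
echo '<input type="submit">';
echo '</form>';

?>

And when processing the submission:

<?php

$secret = 'change-me';
$renderedAt = isset($_POST['renderedAt']) ? (int) $_POST['renderedAt'] : 0;
$sig = isset($_POST['sig']) ? $_POST['sig'] : '';
$expected = hash_hmac('sha256', (string) $renderedAt, $secret);
$elapsed = time() - $renderedAt;

// Reject forged timestamps and submissions that arrive suspiciously fast;
// the 5-second threshold needs tweaking for your own forms.
if ($sig !== $expected || $elapsed < 5)
{
    die('Submission rejected.');
}

?>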

Pindatjuh
Thanks! This is a great idea, and close to what I was looking for.
sombe
Watch out if you want to allow end users to use automatic form fillers such as https://addons.mozilla.org/en-US/firefox/addon/1882, which may allow very fast submission. As with CAPTCHAs, anything annoying the end user is generally not good, especially when it prevents a person in a hurry from going (very) fast.
snowflake
Good point, but it all depends on the context. If the form is a login-form, I completely agree with you. But why disable login from bots? If the context is a comment box, like this one on StackOverflow, I know for sure: if you use auto-fill on a comment box then you are a spammer. Note that if you use auto-fill for signatures, you still require time to actually type content.
Pindatjuh
Note that SO does something like this. Edit a comment too fast or too many times in a row and you will get presented with the "Are you a human?" page.
calmh
A: 

A very simple way is to provide a field like <textarea style="display:none;" name="input"></textarea> and discard all replies that have it filled in.

Another approach is to generate the whole form (or just the field names) using JavaScript; few bots can run it.

Anyway, you won't do much against live "bots" from Taiwan or India who are paid $0.03 per posted link and make their living that way.
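
For illustration, the server-side check for that hidden textarea could be as small as this (a sketch, assuming the field name input from the example above):

<?php

// A human never sees the hidden textarea, so anything in it came from a bot.
if (!empty($_POST['input']))
{
    die('Submission discarded.');
}

?>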

SF.
That's a pretty good point.
sombe
A: 

There is a tutorial about this on the jQuery site. Although it's jQuery, the idea is framework-independent.

If JavaScript isn't available, you may need to fall back to a CAPTCHA-type approach.

Pool
+1  A: 

I actually find that a simple honeypot field works well. Most bots fill in every form field they see, hoping to get around required-field validators.

http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx

If you create a text box, hide it with JavaScript, then verify that the value is blank on the server, this weeds out 99% of the robots out there and doesn't cause 99% of your users any frustration at all. The remaining 1% who have JavaScript disabled will still see the text box, but you can add a message like "Leave this field blank" for those cases (if you care about them at all).

(Also, note that if you put style="display:none" on the field, it's way too easy for a robot to just see that and discard the field, which is why I prefer the JavaScript approach.)
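
A minimal sketch of that approach (the website field name, the label text, and the markup are made up for illustration; jQuery is assumed, as elsewhere in this thread):

<form method="post" action="post-comment.php">
<input type="text" name="comment">

<!-- Honeypot: humans are told to leave it blank, bots usually fill it in -->
<label id="hp-label" for="website">Leave this field blank</label>
<input type="text" id="website" name="website" value="">

<input type="submit">
</form>

<script>
// Hide the honeypot with JavaScript rather than inline CSS, so a bot
// parsing the raw HTML still sees what looks like an ordinary field.
$(function () {
    $('#hp-label, #website').hide();
});
</script>

On the server, any submission where website is non-empty would then be treated as a bot and discarded.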

Ben Scheirman
Do you think bots actually go through the CSS file and figure out it's display:none;? I'd really rather not use a JS-based solution, since it can be easily disabled.
sombe
It's reminiscent of the old trick of webmasters including tons of non-pertinent keywords in order to boost their ranking. I think search crawler bots such as Google's can figure out it's display:none. Why would other bots not be able to do that?
snowflake
The bot would have to execute JavaScript; that's the point. Gal, for the tiny, tiny percentage of your users with JavaScript turned off, you simply have a label that says "Leave this blank". No harm done.
Ben Scheirman
A: 

The easy way I found to do this is to put a field with a pre-filled value and ask the user to remove the text in this field, since bots only fill fields in. If the field is not empty, it means the user is not human and the submission won't be posted. It serves the same purpose as a CAPTCHA code.

matthew
