+191  Q:

## Practical non-image based CAPTCHA approaches?

+3  A:

Someone also suggested the Raphael JavaScript library, which apparently lets you draw on the client in all popular browsers:

http://dmitry.baranovskiy.com/raphael/

.. but that wouldn't exactly work with my `<noscript>` case, now would it ? :)

+3  A:

ASCII text isn't much more legible than the really screwy image captchas that are around these days.

I think the math puzzle would be a good fit here, since we're all supposed to be fairly math-oriented. Just don't ask me to do integration, please.

+2  A:

I like the word math problems. It would be interesting to try it out (at least it's easy to do) and see how the baddies respond.

+12  A:

Be sure it isn't something Google can answer, though. That also shows an issue with the approach: order of operations!

But is that what I _meant_? Does English follow order of operations? Et cetera...
+11  A:

Although we all should know basic maths, the math puzzle could cause some confusion. In your example I'm sure some people would answer with "8" instead of "1".

Would a simple string of text with random characters highlighted in bold or italics be suitable? The user just needs to enter the bold/italic letters as the CAPTCHA.

In this case "stack" would be the CAPTCHA. There are obviously numerous variations on this idea.

Edit: Example variations to address some of the potential problems identified with this idea:

• using randomly coloured letters instead of bold/italic.
• using every second red letter for the CAPTCHA (reduces the possibility of bots identifying differently formatted letters to guess the CAPTCHA)
I like this one - for example "please enter the word spelled by the third underlined red letter, fourth bold green letter, and fifth non-bold blue letter".
That example above 'ssdfatwerweajhcsadkoghvefdhrffghlfgdhowfgh' could be solved by a simple regex
This would not be good for users with acalculia. There are scientists with this affliction so it isn't unreasonable that there could be programmers with it.
Excellent idea! Perhaps even by playing with changing foreground/background colors, you can get something that displays text easily visible to humans, but too random for bots? Of course this is harder on color-blind people :-(
For "there can be different answers for the math question": allow all possible answers. All we need is to prove that the user is human, not a math whiz kid. (Anyway, I am not completely for the math approach: if Google can solve them, everybody can, by sending a request to google.com :-)
None of these ideas offer much more protection than "Type in this word: STACK". If that's all you need, then fine. But don't fool yourselves into thinking that these approaches offer more protection.
Using colour might complicate things, because you would need to support different forms of colour-blindness. Otherwise this sounds pretty good.
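To make the regex objection raised in the comments concrete, here is a minimal sketch. The markup (plain `<b>` tags around the answer letters), the sample string, and the function name `extractBoldLetters` are all assumptions for illustration; the thread does not show how the highlighted string was actually rendered.

```javascript
// Hypothetical challenge markup: the answer letters are wrapped in <b> tags.
const challengeHtml =
  "<b>s</b>sdfa<b>t</b>werwe<b>a</b>jh<b>c</b>sad<b>k</b>oghve";

// Collect every single character that sits inside a <b>...</b> pair.
function extractBoldLetters(html) {
  const matches = html.matchAll(/<b>(.)<\/b>/g);
  return Array.from(matches, (m) => m[1]).join("");
}

console.log(extractBoldLetters(challengeHtml)); // prints "stack"
```

Colours or "every second red letter" rules would need a slightly longer pattern, but probably not a fundamentally different kind of bot.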
+31  A:

The advantage of this approach is that, for most people, the CAPTCHA won't ever be visible!

I like this idea, is there not any way we can just hook into the rep system? I mean, anyone with say +100 rep is likely to be a human. So if they have rep, you need not even bother doing ANYTHING in terms of CAPTCHA.

Then, if they don't have the rep, show them the CAPTCHA. I'm sure it won't take that many posts to get to 100, and the community will instantly dive on anyone who seems to be spamming with offensive tags. Why not add a "report spam" link that downmods by 200? Get 3 of those, spambot achievement unlocked, bye bye ;)

EDIT: I should also add, I like the math idea for the non-image CAPTCHA. Or perhaps a simple riddle-type-thing. May make posting even more interesting ^_^

What happens if a high-karma member's account credentials are stolen?
@nemo Then you deal with it. But very little reason to avoid a solution for this reason alone.
+11  A:

I mean, anyone with say +100 rep is likely to be a human. So if they have rep, you need not even bother doing ANYTHING in terms of CAPTCHA

Yeah, that's what I used to think, too. Note number of revisions on that post and their source. Hi Kevin!

So, CAPTCHA is mandatory for all users except moderators.

half the links in this answer are directed to 404's. Jeff can you better explain what you were getting at?
+21  A:

Explanation of Honeypot Captcha (which looks very good): Bots love forms. They fill out all the fields. A honeypot Captcha includes a field that is HIDDEN by CSS so only the bots (and those with IE 3.0) see it. If it's filled, it's a bot. Very easy to implement.
Again, trivially bypassable with a very minimal investment of time. True, you'll manage to block some scriptkiddies, but if your site has value that's not your main threat.
Yes, this is simple to deploy and works really well. Accessibility is the only real problem.
The accessibility problem can be bypassed simply by adding some text: `Hey, if you're a human, keep this field blank!`
I thought I was being clever when I set this up on my site, but looks like I'm certainly not the first to come up with the idea! This cut a huge amount of drive-by opportunist spam. Obviously, it won't catch everything, but every filter you add raises the bar a little.
A:

+1  A:

Would a simple string of text with random characters highlighted in bold or italics be suitable? The user just needs to enter the bold/italic letters as the captcha.

@Jared - I can barely pick out the bold letters in that string even when I'm trying. Maybe if we made the font HUGE. usability--;

Ty, Welcome to Stackoverflow. The comment features have not always existed.
This is not a very good approach since it's not much more effective than "Type in this word: stack".
@pbreitenbach -- this post was a "comment" responding to that suggested solution (before comments existed).
A:

@pc1oad1etter I also noticed that after doing my post. However, it's just an idea and not the actual implementation. Varying the font or using different colours instead of bold/italics would easily address usability issues.

+1  A:

Who says you have to create all the images on the server with each request? Maybe you could have a static list of images or pull them from flickr. I like the "click on the kitten" captcha idea. http://www.thepcspy.com/kittenauth

A:

@lance

Who says you have to create all the images on the server with each request? Maybe you could have a static list of images or pull them from Flickr. I like the "click on the kitten" CAPTCHA idea. http://www.thepcspy.com/kittenauth.

If you pull from a static list of images, it becomes trivial to circumvent the CAPTCHA, because a human can classify them and then the bot would be able to answer the challenges easily. Even if a bot can't answer all of them, it can still spam. It only needs to be able to answer a small percent of CAPTCHAs, because it can always just retry when an attempt fails.

This is actually a problem with puzzles and such, too, because it's extremely difficult to have a large set of challenges.

A:

@rob

What about a honeypot captcha? Wow, so simple! Looks good! Although they have highlighted the accessibility issue.. Do you think that this would be a problem at SO? I personally find it hard to imagine developers/programmers that have difficulty reading the screen to the point where they need a screen reader?

There are developers who are not just legally blind, but 100% blind. Walking cane and helper dog. I hope the site will support them in a reasonable fashion.

However, with the honeypot captcha, you can put a hidden div as well that tells them to leave the field blank. And you can also put it in the error message if they do fill it in, so I'm not sure how much of an issue accessibility really is here. It's definitely not great, but it could be worse.

+1  A:

I had a load of spam issues on a phpBB 2.0 site I was running a while back (the site is now upgraded).

I installed a custom captcha mod I found on the phpBB forums that worked well for a period of time. I found the real solution was combining this with additional 'required' fields [on the account creation page].
I added Location and Occupation (mundane, yet handy to know).
The bot never tried to fill these in, still assuming the captcha was the point of fail for each attempt.

A:

• ASCII is bad : I had to squint to find "WOW". Is this even correct? It could be "VVOVV" or whatever;
• Very simple arithmetic is good. Blind people will be able to answer. (But as Jarod said, beware of operator precedence.) I gather someone could write a parser, but it makes the spamming more costly.
• Trivia is OK, but you'll have to write each of them :-(

I've seen pictures of animals [what is it?]. Votes for comics use a picture of a character with their name written somewhere in the image [type in name]. Impossible to parse, not ok for blind people.

You could have an audio fallback reading alphanumerics (the same letters and numbers you have in the captcha).

Final line of defense: make spam easy to report (one click) and easy to delete (one recap screen to check it's a spam account, with the last ten messages displayed, one click to delete account). This is still time-expensive, though.

+143  A:

A method that I have developed and which seems to work perfectly (although I probably don't get as much comment spam as you), is to have a hidden field and fill it with a bogus value e.g.:

```html
<input type="hidden" id="antiSpam" name="antispam" value="lalalala" />
```

I then have a piece of JavaScript which updates the value every second with the number of seconds the page has been loaded for:

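```javascript
// Once per second, store the number of seconds since page load in the hidden field.
var antiSpam = function () {
  var field = document.getElementById("antiSpam");
  if (field) {
    if (isNaN(field.value)) {
      // First tick: replace the bogus placeholder with a counter.
      field.value = 0;
    } else {
      field.value = parseInt(field.value, 10) + 1;
    }
  }
  // Pass the function itself rather than a string to be evaluated.
  setTimeout(antiSpam, 1000);
};

antiSpam();
```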

Then when the form is submitted, if the antispam value is still "lalalala", I mark it as spam. If the antispam value is an integer, I check to see if it is above something like 10 (seconds). If it's below 10, I mark it as spam; if it's 10 or more, I let it through.

```
If AntiSpam is an integer
    If AntiSpam >= 10
        Comment = Approved
    Else
        Comment = Spam
Else
    Comment = Spam
```
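For anyone who wants something runnable, the same decision could be sketched in JavaScript roughly as follows. The function name `classifyComment` and its return strings are made up for illustration; the 10-second threshold is the one from the description above, and the answer does not say what the server side is actually written in.

```javascript
// Hypothetical server-side check for the hidden "antispam" field.
// minSeconds defaults to the 10 seconds mentioned in the answer.
function classifyComment(antiSpamValue, minSeconds = 10) {
  // The untouched placeholder ("lalalala") or any other non-numeric value
  // means the script never ran, so treat the comment as spam.
  if (!/^\d+$/.test(String(antiSpamValue))) {
    return "spam";
  }
  const seconds = parseInt(antiSpamValue, 10);
  return seconds >= minSeconds ? "approved" : "spam";
}

console.log(classifyComment("lalalala")); // spam
console.log(classifyComment("3"));        // spam
console.log(classifyComment("42"));       // approved
```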

The theory being that:

• A spam bot will not support JavaScript and will submit what it sees
• If the bot does support JavaScript it will submit the form instantly
• The commenter has at least read some of the page before posting

The downside to this method is that it requires JavaScript: if you don't have JavaScript enabled, your comment will be marked as spam. However, I do review comments marked as spam, so this is not a problem.

@MrAnalogy: The server side approach sounds quite a good idea and is exactly the same as doing it in JavaScript. Good Call.

@AviD: I'm aware that this method is prone to direct attacks as I've mentioned on my blog. However, it will defend against your average spam bot which blindly submits rubbish to any form it can find.

VERSION THAT WORKS WITHOUT JAVASCRIPT: How about if you did this with ASP, etc. and had a timestamp for when the form page was loaded, and then compared that to the time when the form was submitted? If ElapsedTime < 10 sec then it's likely spam.
Very obviously bypassable, if a malicious user bothers to look at it. While I'm sure you're aware of this, I guess you're assuming that they won't bother... Well, if it's not a site of any value, then you're right and they won't bother - but if it is, then they will, and get around it easily enough...
The spammer could use some very old page load too.
@Iny: Can you explain please? I don't understand what you said...
Here's a twist on this that I use. Make the hidden value an encrypted time set to now. Upon post back, verify that between 10 seconds and 10 minutes has elapsed. This foils tricksters who would try to plug in some always-valid value.
@GateKiller: Iny is saying that a spam bot could delay their response. I.e. cache the page for a few seconds and then submit the postback later (meanwhile trawling other sites and caching those pages).
To all who have pointed out that bots could get past... This I know as I pointed out in the answer. It's a very simple method to stop your average bot and bored users. I am currently using it on my blog and so far, it has been 100% successful.
An approach such as this is easily circumvented by a custom bot that understands how to correctly submit the mangled form. Stack Overflow receives enough traffic that this would be worthwhile for a spammer to write.
Three cheers for security by obscurity!
I would just like to point out that if I were to write a spambot, I wouldn't be loading the entry-form, I would be submitting to the POST submission page.
@user257493: You're right, but this type of Captcha is only designed to stop casual bots and not focused attacks.
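The "encrypted time" twist suggested in the comments above could look something like this in Node.js. This is only a sketch under assumptions: it signs the timestamp with an HMAC instead of encrypting it (a swapped-in technique, since the comment gives no details), and `SECRET`, `issueToken`, and `verifyToken` are hypothetical names. The 10-second and 10-minute bounds come from the comment.

```javascript
const crypto = require("crypto");

// Assumption: a server-side secret that never reaches the client.
const SECRET = "replace-with-a-long-random-secret";

function sign(stamp) {
  return crypto.createHmac("sha256", SECRET).update(String(stamp)).digest("hex");
}

// Value to place in the hidden form field when the page is rendered.
function issueToken(nowMs = Date.now()) {
  return nowMs + "." + sign(nowMs);
}

// Accept the post only if the signature is valid and the form was open
// for at least 10 seconds and at most 10 minutes.
function verifyToken(token, nowMs = Date.now()) {
  const [stamp, mac] = String(token).split(".");
  const issued = Number(stamp);
  if (!Number.isFinite(issued) || !mac) return false;

  const expected = Buffer.from(sign(stamp), "hex");
  const received = Buffer.from(mac, "hex");
  if (expected.length !== received.length ||
      !crypto.timingSafeEqual(expected, received)) {
    return false;
  }

  const elapsedMs = nowMs - issued;
  return elapsedMs >= 10000 && elapsedMs <= 600000;
}
```

As other commenters point out, this only stops a bot from plugging in an always-valid value; a bot that loads the form and waits before posting would still get through.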
+2  A:

There was a CAPTCHA you talked about in your blog where you had to identify pictures of dogs or cats. That one has always been memorable to me.

+10  A:

@GateKiller

Good idea, but now that I know how it works I could just set the value of "antispam" to >= 10 when forging a POST request.

Most of the ideas here work great against spam bots but fail hard against attacks. I haven't even tried this, but I doubt there is flood protection; I'm sure someone could write a script to ask a new question every 30 seconds or so.

CAPTCHA is pointless, the best solution is:

1. Lock the thread when you realize an attack happening
2. Flag the user
3. Three(?) flags and you are temp-banned
+10  A:

Although this similar discussion was started:

We are trying this solution on one of our frequently data mined applications:

A Better CAPTCHA Control (Look Ma - NO IMAGE!)

You can see it in action on our Building Inspections Search.

You can view Source and see that the CAPTCHA is just HTML.

That will work for NOW, but as soon as enough sites use an approach like that, spammers will render the html to an image and OCR the result.
A:

How about showing nine random geometric shapes, and asking the user to select the two squares, or two circles or something.. should be pretty easy to write, and easy to use as well..

There's nothing worse than having text you cannot read properly...

+31  A:
This is lame...
+1  A:

Have you looked at Waegis?

"Waegis is an online web service that exposes an open API (Application Programming Interface). It gets incoming data through its API methods and applies a quick check and identifies spam and legitimate content on time. It then returns a result to client to specify if the content is spam or not."

+41  A:

What about Orange? Apologies if it should be obvious.
Jeff Atwood, founder of Stack Overflow, has a blog at www.codinghorror.com. If you want to comment you have to solve a captcha. The captcha is an image that always displays "orange".
Yeah, I never got that.. WTF?
You'd need to write a spam bot that specifically knew that to get past it, security via 'bot creators probably won't target me specifically'..
Incidentally, Jeff Atwood's codinghorror.com now uses reCAPTCHA.
+1  A:

Without an actual CAPTCHA as your first line of defense, aren't you still vulnerable to spammers scripting the browser (trivial using VB and IE)? I.e. load the page, navigate the DOM, click the submit button, repeat...

+19  A:

So, CAPTCHA is mandatory for all users except moderators. [1]

That's incredibly stupid. So there will be users who can edit any post on the site but not post without CAPTCHA? If you have enough rep to downvote posts, you have enough rep to post without CAPTCHA. Make it higher if you have to. Plus there are plenty of spam detection methods you can employ without image recognition, so that even for unregistered users it would never be necessary to fill out those god-forsaken CAPTCHA forms.

+1 for the first sentence. :)
A:

I think they are working on throttling. It would make more sense just to disable CAPTCHA for users with 500+ rep and reset the rep for attackers.

A:

I recently (can't remember where) saw a system that showed a bunch of pictures. Each of the pictures had a character assigned to it. The user was then asked to type in the characters for some pictures that showed examples of some category (cars, computers, buildings, flowers and so on). The pictures and characters changed each time as well as the categories to build the CAPTCHA string.

The only problem is the higher bandwidth associated with this approach and you need a lot of pictures that are classified in categories. There is no need to waste much resources generating the pictures.

A:

One option would be out-of-band communication; the server could send the user an instant message (or SMS message?) that he/she then has to type into the captcha field.

This imparts an "either/or" requirement on the user -- either you must enable JavaScript OR you must be logged on to your IM service of choice. While it maybe isn't as flexible as some of the other solutions above, it would work for the vast majority of users.

Those with edit privileges, feel free to add to the Pros/Cons rather than submitting a separate reply.

Pros:

• Accessible: Many IM clients support reading of incoming messages. Some web-based clients will work with screen readers.

Cons:

• Javascript-disabled users are now dependent on up-time of yet another service, on top of OpenID.
• Bots will cause additional server resource usage (sending the out-of-band communications) unless additional protections are implemented
+1  A:

My solution was to put the form on a separate page and pass a timestamp to it. On that page I only display the form if the timestamp is valid (not too fast, not too old). I found that bots would always hit the submission page directly and only humans would navigate there correctly.

Won't work if you have the form on the content page itself like you do now, but you could show/hide the link to the special submission page based on NoScript. A minor inconvenience for such a small percentage of users.

+39  A:

Unless I'm missing something, what's wrong with using reCAPTCHA as all the work is done externally.

Just a thought.

reCAPTCHA is user-hostile. CAPTCHA is bad enough. But making it harder for users in order to get some tiny OCR benefit is positively hostile.
reCAPTCHA is great, but if your div isn't 300px then you are out of luck. SIGH
Why is it user-hostile? Is spam user-friendly?
It's user-hostile because sometimes the images are hard to decode even for humans, and may cause frustration in legitimate users when this happens. See Josh's link with worst CAPTCHAS for some examples of overly hard to decode images.
@Andrei you can always make reCAPTCHA load another image if it is too hard for you.
reCAPTCHA is fine, and it implements an accessibility option which 95% of homegrown solutions don't even think about.
+165  A:

I don't like that humor rates above real suggestions
But that's nature of voting.
Good solutions can often be found in humorous suggestions.
Nice spoiler, THANKS!!
This answer brings back terrible memories.. what about when Johnny 5 gets the crap beaten out of him in Short Circuit 2? Didn't they know he's alive! Damn them!
so then does the fact that I've never seen that movie mean that I'm a replicant?
+7  A:

Best captcha ever! Maybe you need something like this for sign-up to keep the riff-raff out.

A:

My suggestion would be an ASCII captcha: it does not use an image, and it's programmer/geeky. Here is a PHP implementation, http://thephppro.com/products/captcha/ (this one is paid). There is also a free PHP implementation, although I could not find an example of it: http://www.phpclasses.org/browse/package/4544.html

I know these are in PHP but I'm sure you smart guys building SO can 'port' it to your favorite language.

+6  A:

I just use simple questions that anyone can answer:

What color is the sky?
What color is an orange?
What color is grass?

It makes it so that someone has to custom program a bot to your site, which probably isn't worth the effort. If they do, you just change the questions.

Cyc can solve this trivially... and it's open source. Would require at most a couple hours of scripting to implement.
this is used by ubuntu forum, also. i like it, and the implementations of checks like "2 + 2 = ?" or "what is the first letter of the alphabet" is very simple.
The answers: 1) Right now, a light blue, later, red, then black with hints of orange near downtown. 2) orange, unless it's moldy, then it's green or black or white. 3) brown, in Southern California, unless you're in Beverly Hills, then it's green.
@mmr See, that's actually a benefit of the system, it keeps the smartasses from posting comments...
The second and third answers are biased towards people living in deserts or Baltimore.
+4  A:

What if you used a combination of the captcha ideas you had (choose any of them - or select one of them randomly):

• math puzzles: what is 7 minus 3 times 2?
• trivia questions: what tastes better, a toad or a popsicle?

with the addition of placing the exact same captcha in a css hidden section of the page - the honeypot idea. That way, you'd have one place where you'd expect the correct answer and another where the answer should be unchanged.

+139  A:
That one is great. The link to the site is http://random.irb.hr/signup.php. Sometimes it's a lot easier
holyshit! that is a great question! it keeps the idiots out as well =D
Actually, it just keeps those who don't have a formal education in trig out...
Only problem is that it is really hard for majority of humans but computers will usually have no problem with this.
I believe the answer to that problem is -3?
@Erik, not really. It also keeps those who have PhDs in computer science but don't want to bother out.
-3 seems correct. I remember using this website for research a while ago and when I got to the Captcha I was so happy because it was fun and different. It is for access to a quantum random number generator using an actual radioactive decaying source.
@Erik - it also keeps those who don't have a formal education in calculus out.
Damn... I got lucky. Mine told me to find the "least real zero" in a rather simple polynomial.
This is from that site that provides quantum random numbers, isn't it? I nearly shat a brick when I got one of these questions when I tried to register.
The whole point is that if you can't solve it, refresh it. Computers probably won't refresh. ;)
Also keeps out those of us whose corporate firewalls block flickr.
The other problem with this is that a determined attacker could just parse this and plug it into Mathematica, surely? http://www.wolfram.com/products/mathematica/index.html
@therefromhere - I don't think this sort of CAPTCHA is prevalent enough to dedicate that much effort to OCR, buying Mathematica, etc. :-)
Is this even a real math problem?!?!
I would like to see more sites with captchas like that, it would prevent not only automatic spam but also silly people from posting xD
I like this. It not only filters out bots but also people unlikely to make good use of the service
Really funny +1
+1  A:

If you're leaning towards the question/answer solution: in the past I've presented users with a dropdown of 3-5 random questions that they could choose from and then answer to prove they were human. The list was sorted differently on each page load.

+2  A:
+28  A:

Avoid the worst CAPTCHAs of all time.

Trivia is OK, but you'll have to write each of them :-(

Someone would have to write them.

You could do trivia questions in the same way ReCaptcha does printed words. It offers two words, one of which it knows the answer to, another which it doesn't - after enough answers on the second, it now knows the answer to that too. Ask two trivia questions:

A woman needs a man like a fish needs a?

Orange orange orange. Type green.

Of course, this may need to be coupled with other techniques, such as timers or computed secrets. Questions would need to be rotated/retired, so to keep the supply of questions up you could ad-hoc add:

You don't even need an answer; other humans will figure that out for you. You may have to allow flagging questions as "too hard", like this one: "asdf ejflf asl;jf ei;fil;asfas".

Now, to slow someone who's running a StackOverflow gaming bot, you'd rotate the questions by IP address - so the same IP address doesn't get the same question until all the questions are exhausted. This slows building a dictionary of known questions, forcing the human owner of the bots to answer all of your trivia questions.
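A rough sketch of that per-IP rotation, assuming an in-memory store and a tiny made-up question pool (a real site would need persistent storage and far more questions; `QUESTIONS`, `remainingByIp`, and `nextQuestionFor` are illustrative names):

```javascript
// Hypothetical question pool; a real one would be much larger.
const QUESTIONS = [
  "What color is the sky?",
  "What color is grass?",
  "A woman needs a man like a fish needs a?",
];

// Maps an IP address to the question indexes it has not been asked yet.
const remainingByIp = new Map();

function nextQuestionFor(ip) {
  let remaining = remainingByIp.get(ip);
  if (!remaining || remaining.length === 0) {
    // First visit, or pool exhausted: start a fresh cycle for this IP.
    remaining = QUESTIONS.map((_, i) => i);
  }
  // Pick a random unseen question and remove it from this IP's list.
  const pick = Math.floor(Math.random() * remaining.length);
  const [index] = remaining.splice(pick, 1);
  remainingByIp.set(ip, remaining);
  return QUESTIONS[index];
}
```

Calling `nextQuestionFor("203.0.113.7")` three times returns each question exactly once, in random order, before any of them repeats for that address.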

I disagree. (15)
Just be careful with trivia questions, as they may sometimes be easy for you and *incredibly* difficult for people from different countries who haven't mastered English, or for people from a different culture. They may get upset if you force them to use a dictionary just to log in! Or even worse, they may just stop using the site.
"A woman needs a man like a fish needs a?" So what is the answer to this question?
bicycle. Quote by Irina Dunn (popularized by Gloria Steinem).
... like a fish needs a lady-fish.
+5  A:

Unless I'm missing something, what's wrong with using reCAPTCHA as all the work is done externally.

RTFQ:

However, for people with JavaScript disabled, we still need a fallback -- and this is where it gets tricky.

reCAPTCHA also has some of the worst rendering out there making it darn hard for a human to understand, let alone a bot.
reCAPTCHA also has a server-side method for users with no JavaScript.
reCAPTCHA is user-hostile. reCAPTCHA's secondary purpose has made it worse for its primary purpose which is lame to foist on users.
+1  A:

Even with rep, there should still be SOME type of CAPTCHA, to prevent a malicious script attack.

+5  A:

Very simple arithmetic is good. Blind people will be able to answer. (But as Jarod said, beware of operator precedence.) I gather someone could write a parser, but it makes the spamming more costly.

If it is sufficiently simple, it will not be difficult to code around. I see two threats here:

1. random spambots and the human spambots that might back them up; and
2. bots created to game Stack Overflow

With simple arithmetic, you might beat off threat #1, but not threat #2.

A parser, I'd assume, is significantly easier than writing an image-captcha cracker. Remember, the easiest thing you offer to users is what a spambot will probably use. Sadly, the no-JS captcha needs to be harder.
WolframAlpha is really good at the math questions
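To give a feel for how little code such a parser needs, here is a sketch that solves word puzzles like the "7 minus 3 times 2" example mentioned elsewhere in the thread. The supported vocabulary (plus, minus, times) and all function names are assumptions; it evaluates the puzzle both with operator precedence and strictly left to right, which also shows where the "8 instead of 1" confusion comes from.

```javascript
const WORD_OPS = { plus: "+", minus: "-", times: "*" };

// Turn "what is 7 minus 3 times 2?" into tokens such as [7, "-", 3, "*", 2].
function tokenize(question) {
  const words = question.toLowerCase().match(/\d+|plus|minus|times/g) || [];
  return words.map((w) => (w in WORD_OPS ? WORD_OPS[w] : Number(w)));
}

function apply(a, op, b) {
  if (op === "+") return a + b;
  if (op === "-") return a - b;
  return a * b;
}

// Evaluate strictly left to right, ignoring precedence.
function evalLeftToRight(tokens) {
  let result = tokens[0];
  for (let i = 1; i < tokens.length; i += 2) {
    result = apply(result, tokens[i], tokens[i + 1]);
  }
  return result;
}

// Evaluate with multiplication binding tighter than + and -.
function evalWithPrecedence(tokens) {
  const reduced = [tokens[0]];
  for (let i = 1; i < tokens.length; i += 2) {
    if (tokens[i] === "*") {
      reduced[reduced.length - 1] *= tokens[i + 1];
    } else {
      reduced.push(tokens[i], tokens[i + 1]);
    }
  }
  return evalLeftToRight(reduced);
}

const tokens = tokenize("what is 7 minus 3 times 2?");
console.log(evalWithPrecedence(tokens)); // 1
console.log(evalLeftToRight(tokens));    // 8
```

A bot could simply try both results, which is one more reason to accept every reasonable answer rather than rely on the puzzle for security.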
+2  A:

I wrote up a PHP class that lets you choose to use a certain class of Captcha Question (math, naming, opposites, completion), or to randomize which type is used. These are questions that most english-speaking children could answer. For example:

1. Math: 2+5 = _
2. Naming: The animal in this picture is a ____
3. Opposites: The opposite of happy is ___
4. Completion: A cow goes _
I really don't like questions that could have a variety of answers. Someone could say the opposite of happy is sad, while another could state that the opposite of happy is angry.
Or indifference. Yeah, this stuff just isn't really a solution. Any of the "question" type systems can be circumvented the same way CAPTCHAs can.
A:

Our form spam has been drastically cut after implementing the honeypot captcha method as mentioned previously. I believe we haven't received any since implementing it.

+1  A:

Do you ever plan to provide an API for Stackoverflow that would allow manipulation of questions/answers programmatically? If so, how is CAPTCHA based protection going to fit into this?

While providing just a rich read-only interface via Atom syndication feeds would allow people to create some interesting smart-clients/tools for organizing and searching the vast content that is Stackoverflow; I could see having the capability outside of the web interface to ask and/or answer questions as well as vote on content as extremely useful. (Although this may not be in line with an ad-based revenue model.)

I would prefer to see Stackoverflow use a heuristic monitoring approach that attempts to detect malicious activity and block the offending user, but can understand how using CAPTCHA may be a simpler approach with your release date coming up soon.

A:

Perhaps the community can come up with some good text-based CAPTCHAs?

We can then come up with a good list based on those with the most votes.

A:

Mollom is another Akismet-type service which may be of interest. From the guys who wrote Drupal / run Acquia.

+2  A:

The list of answers was overwhelming!

But searching the page, I haven't seen anyone mention "Bad Behavior" yet. It's a plugin for most blogging systems that detects bots based on some bad behavior; you might want to check that out.

+1  A:

This will be per-sign-up and not per-post, right? Because that would just kill the site, even with jQuery automation.

+1  A:

Use a simple text CAPTCHA and then ask the users to enter the answer backwards or only the first letter, or the last, or another random thing.

Another idea is to make an ASCII image, like this (from the Portal game end sequence):

```
                             .,---.
,/XM#MMMX;,
-%##########M%,
[email protected]######%  $###@=
.,--,         -H#######$   $###M:
,;$M###MMX;     .;##########$;HM###X=
,/@##########H=      ;################+
-+#############M/,      %##############+
%M###############=      /##############:
H################      .M#############;.
@###############M      ,@###########M:.
X################,      -$=X#######@:
/@##################%-     +######$-
.;##################X     .X#####+,
.;H################/     -X####+.
,;X##############,       .MM/
,:[email protected]#######M#$-    .$$=
.,-=;[email protected]###X:    ;/=.
.,/X$;   .::,
.,    ..
```

And give the user some options like: IS A, LIE, BROKEN HEART, CAKE.

+11  A:

At first I read it as "Asirra is the most adoptable captcha ever." which threw me off slightly. I agree that it is probably the most adorable, but just as it states on the site, a bot writer could just save out all of the images (could take awhile), classify them then the bot would break it easily.
how can a blind person answer those?
A:

How about just checking to see if JavaScript is enabled?

Anyone using this site is surely going to have it enabled. And from what folks say, the Spambots won't have JavaScript enabled.

+4  A:

I've had amazingly good results with a simple "Leave this field blank:" field. Bots seem to fill in everything, particularly if you name the field something like "URL". Combined with strict referrer checking, I've not had a bot get past it yet.

Please don't forget about accessibility here. Captchas are notoriously unusable for many people using screen readers. Simple math problems, or very trivial trivia (I liked the "what color is the sky" question) are much more friendly to vision-impaired users.

A:

CAPTCHAs check if you are human or computer. The problem is that after that a computer needs to judge whether you are human.

So a solution would be to let one user fill out a CAPTCHA and let the next user check it. The problem is of course the time gap.

A:

I think we must assume that this site will be subject to targeted attacks on a regular basis, not just generic drifting bots. If it becomes the first hit for programmers' searches, it will draw a lot of fire.

To me, that means that any CAPTCHA system cannot pull from a repeating list of questions, which a human can manually feed into a bot, in addition to being unguessable by bots.

+1  A:

If you want an ASCII-based approach, take a look at integrating FIGlet. You could make some custom fonts and do some font selection randomization per character to increase the entropy. The kerning makes the text more visually pleasing and a bit harder for a bot to reverse engineer.

Such as:

```
   ______           __     ____               _____
  / __/ /____ _____/ /__  / __ \_  _____ ____/ _/ /__ _    __
 _\ \/ __/ _ `/ __/  '_/ / /_/ / |/ / -_) __/ _/ / _ \ |/|/ /
/___/\__/\_,_/\__/_/\_\  \____/|___/\__/_/ /_//_/\___/__,__/
```
+2  A:

I have to admit that I have no experience fighting spambots and don't really know how sophisticated they are. That said, I don't see anything in the jQuery article that couldn't be accomplished purely on the server.

To rephrase the summary from the jQuery article:

1. When generating the contact form on the server ...
2. Grab the current time.
3. Combine that timestamp, plus a secret word, and generate a 32 character 'hash' and store it as a cookie on the visitor's browser.
4. Store the hash or 'token' timestamp in a hidden form tag.
5. When the form is posted back, the value of the timestamp will be compared to the 32 character 'token' stored in the cookie.
6. If the information doesn't match, or is missing, or if the timestamp is too old, stop execution of the request ...
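The steps above can be sketched purely on the server. This is a minimal sketch, not the jQuery article's code: the names `SECRET`, `MAX_AGE_SECONDS`, `make_token` and `check_submission` are hypothetical, and HMAC-SHA256 truncated to 32 hex characters is swapped in for whatever hash the article actually uses.

```python
import hashlib
import hmac
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical secret word, kept server-side
MAX_AGE_SECONDS = 15 * 60                # assumed limit for "too old"


def make_token(timestamp, secret=SECRET):
    """Combine the timestamp with the secret and return a 32-character hex token."""
    digest = hmac.new(secret, str(timestamp).encode(), hashlib.sha256).hexdigest()
    return digest[:32]  # truncated to match the 32 characters described above


def check_submission(form_timestamp, cookie_token, now=None):
    """Return True only if the hidden timestamp matches the cookie token and is fresh."""
    if not form_timestamp or not cookie_token:
        return False  # information is missing
    try:
        ts = int(form_timestamp)
    except ValueError:
        return False
    now = int(time.time()) if now is None else now
    if ts > now or now - ts > MAX_AGE_SECONDS:
        return False  # timestamp is in the future or too old
    expected = make_token(ts)
    # Constant-time comparison of the recomputed token and the cookie value.
    return hmac.compare_digest(expected.encode(), cookie_token.encode())
```

When rendering the form you would call `make_token(int(time.time()))`, put the timestamp in the hidden field and the token in the cookie, then call `check_submission` on the POST.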

Another option, if you want to use the traditional image CAPTCHA without the overhead of generating them on every request is to pre-generate them offline. Then you just need to randomly choose one to display with each form.

A:

KP's suggestion of the below CAPTCHA is very clever and imageless...

I'd vote for this!

yes but it is generating an image behind the scenes, then translating the image into HTML commands.. effectively, it's an image.
You're right, an existing bitmap is read in to generate the html. The right thing to do would be to dynamically generate said bitmap from a text field.
+10  A:

I've been using the following simple technique. It's not foolproof: if someone really wants to bypass it, it's easy to look at the source (i.e. not suitable for the Google CAPTCHA), but it should fool most bots.

Add 2 or more form fields like this:

``````
<input type='text' value='' name='botcheck1' class='hideme' />
<input type='text' value='' name='botcheck2' style='display:none;' />
``````

Then use CSS to hide them:

``````
.hideme {
    display: none;
}
``````

On submit, check whether those form fields have any data in them; if they do, fail the form post. The reasoning is that bots will read the HTML and attempt to fill every form field, whereas humans won't see the input fields and will leave them alone.

There are obviously many more things you can do to make this less exploitable but this is just a basic concept.
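The server-side check for the hidden fields could look roughly like this. It is only a sketch: it assumes the posted form arrives as a plain dictionary of strings, and `looks_like_bot` is a hypothetical helper name.

```python
HONEYPOT_FIELDS = ("botcheck1", "botcheck2")  # field names from the HTML above


def looks_like_bot(form):
    """Return True if any hidden honeypot field came back with data in it."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)
```

A request for which `looks_like_bot(form)` is true would then be rejected before the comment is stored.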

+15  A:

CAPTCHA, in its current conceptualization, is broken and often easily bypassed. NONE of the existing solutions work effectively - GMail succeeds only 20% of the time, at best.

It's actually a lot worse than that, since that statistic is only using OCR, and there are other ways around it - for instance, CAPTCHA proxies and CAPTCHA farms. I recently gave a talk on the subject at OWASP, but the ppt is not online yet...

While CAPTCHA cannot provide actual protection in any form, it may be enough for your needs, if what you want is to block casual drive-by trash. But it won't stop even semi-professional spammers.

Typically, for a site with resources of any value to protect, you need a 3-pronged approach:

• Throttle responses from authenticated users only, disallow anonymous posts.
• Minimize (not prevent) the few trash posts from authenticated users - e.g. reputation-based. A human moderator can also help here, but then you have other problems - namely, flooding (or even drowning) the moderator, and some sites prefer the openness...
• Use server-side heuristic logic to identify spam-like behavior, or better non-human-like behavior.

CAPTCHA can help a TINY bit with the second prong, simply because it changes the economics - if the other prongs are in place, it no longer becomes worthwhile to bother breaking through the CAPTCHA (minimal cost, but still a cost) to succeed in such a small amount of spam.

Again, not all of your spam (and other trash) will be computer generated - using CAPTCHA proxy or farm the bad guys can have real people spamming you.

CAPTCHA proxy is when they serve your image to users of other sites, e.g. porn, games, etc.

A CAPTCHA farm has many cheap laborers (India, far east, etc) solving them... typically between $2-4 per 1000 captchas solved. Recently saw a posting for this on eBay...

Proxies and farms don't break it or get around 'CAPTCHA' as they are being solved by humans. Indeed the very existence of them is testimony to the fact that current methods DO work! CAPTCHA does not mean 'The type of submission I want' only 'Is it a human submitting'...
Exactly! But CAPTCHAs are most often used to prevent "bots" - and it matters not if these bots are human or not, the intent is to prevent mass, non personal usage. This just proves what I always say, CAPTCHA solves the *wrong* problem (and does so badly)...
There are a lot of situations where captcha is fine. The point is that web site owners should choose a solution that balances user experience with control. For some, no captcha. For others, captcha. For still others, something else. But just dismissing captcha altogether is not smart.
The problem stems from thinking that putting CAPTCHA in will GIVE you that control. It doesn't. Not one substantial bit. There ARE some rare situations where it can provide some value, but NOT "control". (I've often mentioned that the CAPTCHA here, together with the other mechanisms, gives that extra little bit to help make spamming not worthwhile.)
+2  A:

## The most effective non-image CAPTCHA I happened to fill

When registering for new hosting, I was called by a hosting company bot (on my mobile phone) and it spelled out three digits. I had to enter those digits to finish registration. This also provides decent anti-scam protection.

## The most unusual CAPTCHA I have seen

Simple Weiqi problems to solve (to comment in a Russian Weiqi blog weiqi.ru/news):

This is an image-based CAPTCHA though.

A:

Do lots of these JavaScript solutions work with screen readers? And images without a meaningful alt attribute probably break WCAG.

A:

One way I know of to weed out bots is to store a key in the user's cookie and, if the key or cookie doesn't exist, assume they're a bot and ignore them or fall back to an image CAPTCHA. It's also a really good way of preventing a bunch of sessions/tracking being created for bots, which can add a lot of noise to your DB or overhead to your system performance.

A:

One thing that is baffling is how Google, apparently the company with the most CS PhDs in the world, can have their CAPTCHA broken and seem to do nothing about it.

A:

Post a math problem as an IMAGE, probably with parentheses for clarity.

Just clearly visible text in an image.

``````
(2+5)*2
``````
+1  A:

Not the most refined anti-spam weapon, but hey, Microsoft endorsed:

Nobot-Control (part of AjaxControlToolkit).

NoBot can be tested by violating any of the above techniques: posting back quickly, posting back many times, or disabling JavaScript in the browser.

Demo:

http://www.asp.net/AJAX/AjaxControlToolkit/Samples/NoBot/NoBot.aspx

+22  A:

I saw this once on a friend's site. He is selling it for 20 bucks. It's ASCII art!

``````
  .oooooo.         oooooooo
 d8P'  `Y8b       dP"""""""
888      888     d88888b.
888      888 V       `Y88b '
888      888           ]88
`88b    d88'     o.   .88P
 `Y8bood8P'      `8bd88P'
``````
+1, although I don't think you should pay for something like this. I would rather have it built from scratch.
nice, but would need a spoken version as well for blind people
The problem with this is that it is easier than an image to crack. All you would have to do is read it into a picture, and you have a perfect black and white image to do OCR on.
@Andrei, there is an alternative version of this, generated using "figlet", which can "mush" characters together so that the characters of the captcha share ASCII chars. These are a bit harder to OCR.
It may be advertising but a) it is a valid answer to the question and b) the author clearly states that this is by a friend (which means he clearly states to be biased), so I don't see any problem with the answer.
@Michael Are you the friend?
Probably not. Even if he was, his point remains. Learn to separate ideas from their speaker.
A:

You don't only want humans posting. You want humans that can discuss programming topics. So you should have a trivia captcha with things like:

What does the following C function declaration mean: `char *(*(**foo [][8])())[];` ?

=)

+1  A:
A:

Which color is the fifth word of this sentence? red?, blue, green?

+1  A:

If the main issue with not using images for the captcha is the CPU load of creating those images, it may be a good idea to figure out a way to create those images when the CPU load is "light" (relatively speaking). There's no reason why the captcha image needs to be generated at the same time that the form is generated. Instead, you could pull from a large cache of captchas, generated the last time server load was "light". You could even reuse the cached captchas (in case there's a weird spike in form submissions) until you regenerate a bunch of new ones the next time the server load is "light".

A:

I think a custom-made CAPTCHA is your best bet. This way it requires a specifically targeted bot/script to crack it. This effort factor should reduce the number of attempts. Humans are lazy, after all.

+2  A:

We generate and check the distorted images, so you don't need to run costly image generation programs.

I really dislike reCAPTCHA for the reason that it has a secondary purpose that makes its primary purpose worse.
so you'd rather waste time doing nothing instead of actually helping someone? you're going to run up to them anyways....
A:

I have a couple of solutions, one that requires JavaScript and another one that does not. Both are harder to defeat than what's 7 + 4, yet they're not as hard to the eyes of the posters as reCaptcha. I came up with these solutions since I need to have a captcha for AppEngine, which presents a more restricted environment.

+11  A:

You need to say which one is a cat or a dog, machines can't do this.. http://research.microsoft.com/asirra/

Is a cool one..

A:

``````
<div style="position:relative;top:0;left:0">
<span style="position:absolute;left:4em;top:0">E</span>
<span style="position:absolute;left:3em;top:0">D</span>
<span style="position:absolute;left:1em;top:0">B</span>
<span style="position:absolute;left:0em;top:0">A</span>
<span style="position:absolute;left:2em;top:0">C</span>
</div>
``````

This displays "ABCDE". Of course it's still easy to get around using a custom bot.
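Markup like the block above could be generated on the server along these lines. This is a sketch under assumptions: `scrambled_html` is a hypothetical helper, and it simply shuffles the source order of the spans while keeping each letter's `left` offset equal to its position in the text.

```python
import html
import random


def scrambled_html(text, rng=random):
    """Emit one absolutely positioned span per character, in shuffled source order."""
    items = list(enumerate(text))  # (display position, character)
    rng.shuffle(items)             # source order no longer matches display order
    spans = "\n".join(
        f'<span style="position:absolute;left:{pos}em;top:0">{html.escape(ch)}</span>'
        for pos, ch in items
    )
    return f'<div style="position:relative;top:0;left:0">\n{spans}\n</div>'
```

Calling `scrambled_html("ABCDE")` yields five spans whose visual order is still A to E, whatever order they appear in the source.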

A:

The image could be created on the client side from vector based information passed from the server.

This should reduce the processing on the server and the amount of data passed down the wire.

+1  A:

Bias in Intelligence Testing

A:

I recommend trivia questions. Not everybody can understand ASCII representations of letters, and math questions with more than one operation can get confusing.

+1  A:

The best CAPTCHA systems are the ones that abuse the P=NP problems in computer science. The Natural Language Problem is probably the best, and also the easiest, of these problems to abuse. Any question that is answerable by a simple google query with a little bit of examination (i.e. What's the second planet in our solar system? is a good question, whereas 2 + 2 = ? is not) is a worthy candidate in that situation.

+1  A:

What about displaying captchas using styled HTML elements like divs? It's easy to build letters from rectangular regions and hard to analyze them.

+5  A:

I personally do not like CAPTCHA: it harms usability and does not solve the security issue of making valid users invalid.

I prefer methods of bot detection that you can do server side. Since you have valid users (thanks to OpenID) you can block those who do not "behave", you just need to identify the patterns of a bot and match it to patterns of a typical user and calculate the difference.

Davies, N., Mehdi, Q., Gough, N. : Creating and Visualising an Intelligent NPC using Game Engines and AI Tools http://www.comp.glam.ac.uk/ASMTA2005/Proc/pdf/game-06.pdf

Golle, P., Ducheneaut, N. : Preventing Bots from Playing Online Games <-- ACM Portal

Ducheneaut, N., Moore, R. : The Social Side of Gaming: A Study of Interaction Patterns in a Massively Multiplayer Online Game

Sure, most of these references point to video game bot detection, but that is because that was the topic of our group's paper, titled Robot Wars: An In-Game Exploration of Robot Identification. It was not published or anything, just something for a school project. I can email it if you are interested. The fact is, though, that even if it is based on video game bot detection, you can generalize it to the web because there is a user attached to patterns of usage.

I do agree with MusiGenesis 's method of this approach because it is what I use on my website and it does work decently well. The invisible CAPTCHA process is a decent way of blocking most scripts, but that still does not prevent a script writer from reverse engineering your method and "faking" the values you are looking for in javascript.

I will say the best method is to 1) establish a user so that you can block when they are bad, 2) identify an algorithm that detects typical patterns vs. non-typical patterns of website usage and 3) block that user accordingly.

Why can't a bot register OpenIDs? An attacker just needs to create their own OpenID publisher.
Yes @rjmunro, and that is a good thing. The difficulty of the internet is identification of anonymous users. If a bot registers an OpenID and you identify that OpenID user as a bot then you can shut it down. It is no longer anonymous. That doesn't prevent multiple registrations by the same provider, but then you can shut that provider down for allowing bots. The goal is to remove the anonymity of the internet as best you can.
+2  A:

Simple text sounds great. Bribe the community to do the work! If you believe, as I do, that SO rep points measure a user's commitment to helping the site succeed, it is completely reasonable to offer reputation points to help protect the site from spammers.

Offer +10 reputation for each contribution of a simple question and a set of correct answers. The question should be suitably far away (edit distance) from all existing questions, and the reputation (and the question) should gradually disappear if people can't answer it. Let's say if the failure rate on correct answers is more than 20%, then the submitter loses one reputation point per incorrect answer, up to a maximum of 15. So if you submit a bad question, you get +10 now but eventually you will net -5. Or maybe it makes sense to ask a sample of users to vote on whether the captcha question is a good one.

Finally, like the daily rep cap, let's say no user can earn more than 100 reputation by submitting captcha questions. This is a reasonable restriction on the weight given to such contributions, and it also may help prevent spammers from seeding questions into the system. For example, you could choose questions not with equal probability but with a probability proportional to the submitter's reputation. Jon Skeet, please don't submit any questions :-)
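To make the arithmetic concrete, here is a rough sketch of the scoring and the reputation-weighted selection. All names are hypothetical, and the question pool is assumed to be a list of `(question, submitter_reputation)` pairs.

```python
import random

REWARD = 10        # reputation granted when a question is submitted
MAX_PENALTY = 15   # most reputation a bad question can cost
FAILURE_LIMIT = 0.20


def net_reputation(incorrect_answers, failure_rate):
    """Net reputation a submitter ends up with for one question."""
    if failure_rate <= FAILURE_LIMIT:
        return REWARD
    return REWARD - min(incorrect_answers, MAX_PENALTY)


def pick_question(pool, rng=random):
    """Choose a question with probability proportional to its submitter's reputation."""
    questions = [question for question, _ in pool]
    weights = [reputation for _, reputation in pool]
    return rng.choices(questions, weights=weights, k=1)[0]
```

A question that keeps failing bottoms out at `10 - 15 = -5`, which is the net loss described above.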

A:

How about just using ASP.NET Ajax NoBot? It seems to work DECENTLY for me. It is not awesomely great, but decent.

+1  A:

I would do a simple time based CAPTCHA.

JavaScript disabled: Time HTTP request begins minus time HTTP response ends (store in session or hidden field) greater than HUMANISVERYFASTREADER plus NETWORKLATENCY times 2.

In either case if it returns true then you redirect to an image CAPTCHA. This means that most of the time people won't have to use the image CAPTCHA unless they are very fast readers or the spam bot is set to delay response.
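One way to read the check above is that a form posted back faster than a person could plausibly read it gets sent to the image CAPTCHA. A minimal sketch of that reading follows; the two constants stand in for HUMANISVERYFASTREADER and NETWORKLATENCY, and their values are pure assumptions.

```python
MIN_READ_SECONDS = 3.0   # stands in for HUMANISVERYFASTREADER (assumed value)
NETWORK_LATENCY = 0.2    # stands in for NETWORKLATENCY (assumed value)


def needs_image_captcha(served_at, posted_at):
    """True when the form came back too quickly to have been read by a human."""
    elapsed = posted_at - served_at
    return elapsed < MIN_READ_SECONDS + 2 * NETWORK_LATENCY
```

`served_at` would be the time stored in the session or hidden field when the response ended, and `posted_at` the time the next request began.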

Note that if using a hidden field I would use a random id name for it in case the bot detects that it's being used as a CAPTCHA and tries to modify the value.

Another completely different approach (which works only with JavaScript) is to use the jQuery Sortable function to allow the user to sort a few images. Maybe a small 3x3 puzzle.

+2  A:

Mixriot.com uses an ASCII art CAPTCHA (not sure if this is a 3rd party tool.)

``````
 OooOOo  .oOOo.  o   O    oO
o       O       O   o     O
O       o       o   o     o
ooOOo.  OoOOo.  OooOOo    O
O  O    O      O     o
o  O    o      o     O
`OooO'  `OooO'      O   OooOO
``````
shame it requires so much space on the screen, I guess you could drastically reduce the font size just for this text...
+1  A:

Not a technical solution but a theoretical one.

1. A word (or words) or a sound is given: "Move mouse to top left of screen and click on the orange button" or "Click here and then click here" (a multi-step response is needed). When the tasks are done, the problem is solved. Pick objects that are already on the page for the user to click on. Require at least two actions.

Hope this helps.

A:

I like the captcha as is used in the "great rom network": link text

Click the colored smile, it is funny and everyone can understand... except bots haha

+1  A:

I think the problem with a textual captcha approach is that text can be parsed and hence answered.

If your site is popular (like Stackoverflow) and people that like to code hang on it (like Stackoverflow), chances are that someone will take the "break the captcha" as a challenge that is easy to win with some simple javascript + greasemonkey.

So, for example, the hidden colorful letters approach suggested somewhere in the thread (a cool idea, indeed) can be easily broken with a simple parsing of the following example line:

``````
<div id = "captcha">
<span class = "red">s</span>
asdasda
<span class = "red">t</span>
asdff
<span class = "red">a</span>
jeffwerf
<span class = "red">c</span>
sdkk
<span class = "red">k</span>
</div>
``````

Ditto, parsing this is easy:

``````
3 + 4 = ?
``````

If it follows the schema (x + y) or the like.

Similarly, if you have an array of questions (`what color is an orange?`, `how many dwarves surround Snow White?`), unless you have hundreds or thousands of them, one can pick some 30 of them, make a questions-answers hash and make the script bot reload the page until one of the 30 is found.
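To show how little work the first two examples take, here is a short sketch that solves both with regular expressions. `SAMPLE` is the markup from this answer; the function names are made up for illustration.

```python
import re

SAMPLE = """<div id = "captcha">
<span class = "red">s</span>
asdasda
<span class = "red">t</span>
asdff
<span class = "red">a</span>
jeffwerf
<span class = "red">c</span>
sdkk
<span class = "red">k</span>
</div>"""


def red_letters(markup):
    """Collect the contents of every span whose class is "red"."""
    return "".join(re.findall(r'<span\s+class\s*=\s*"red"\s*>(.*?)</span>', markup))


def solve_sum(question):
    """Solve questions of the form "x + y = ?"; return None for anything else."""
    match = re.fullmatch(r"\s*(\d+)\s*\+\s*(\d+)\s*=\s*\?\s*", question)
    return int(match.group(1)) + int(match.group(2)) if match else None
```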

A:

Just to throw it out there. I have a simple math problem on one of my contact forms that simply asks

what is [number 1-12] + [number 1-12]

I probably get 5-6 spam messages a month, but I'm not getting that much traffic.
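A question like that takes only a few lines to generate and check. This is a sketch, not the code behind my form; `make_question` and `check_answer` are hypothetical names.

```python
import random


def make_question(rng=random):
    """Return the question text and the expected answer."""
    a, b = rng.randint(1, 12), rng.randint(1, 12)
    return f"what is {a} + {b}", a + b


def check_answer(expected, submitted):
    """Compare the submitted text with the expected sum, tolerating whitespace."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```

The expected answer would be kept in the session when the form is rendered and compared on submit.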

+2  A:

A theoretical idea for a captcha filter. Ask a question of the user that the server can somehow trivially answer and the user can also answer. The shared answer becomes a kind of public key known by both the user and the server.

A Stack Overflow related example:

How many reputation points does user XYZ have?

Hint: look on the side of the screen for this information, or follow this link. The user could be randomly pulled from known stack overflow users.

A more generic example: Where do you live? What were the weather conditions at 9:00 on Saturday where you live? Hint: Use yahoo weather and provide humidity and general conditions.

Then the user enters their answer

Seattle Partly cloudy, 85% humidity

The computer confirms that it was indeed those weather conditions in Seattle at that time.

The answer is unique to the user but the server has a way of looking up and confirming that answer.

The types of questions could be varied. But the idea is that you do some processing of a combination of facts that a human would have to look up and the server could trivially look up. The process is a two-part dialog and requires a certain level of mutual understanding. It is a kind of reverse Turing test: have the human prove they can provide a computable piece of data, where it takes human knowledge to produce that data.

Another possible implementation. What is your name and when were you born?

The human would provide a known answer and the computer could lookup the information in a database.

Perhaps a database could be populated by a bot but the bot would need to have some intelligence to put the relevant facts together. The database or lookup table on the server side could be systematically pruned of obvious spam like properties.

I am sure that there are flaws and details to be worked out in the implementation. But the concept seems sound. The user provides a combination of facts that the server can lookup, but the server has control over the kind of combinations that should be asked. The combinations could be randomized and the server could use a variety of strategies to lookup the shared answer. The real benefit is that you are asking the user to provide some sort of profiling and revelation of themselves in their answer. This makes it all the more difficult for bots to be systematic. A bunch of computers start using the same answers across many servers and captcha forms such as

I am Robot born 1972 at 3:45 pm.

Then that kind of response can be profiled and used by a whole network to block the bots, effectively making the automation worthless after a few iterations.

As I think about this more it would be interesting to implement a basic reading comprehension test for commenting on blog posts. After the end of a blog post the writer could pose a question to his or her readers. The question could be unique to each blog post and it would have the added benefit of requiring users to actually read before commenting. One could write the simple question at the end of a post with answers stored server side and then have an array of non sense questions to salt the database.

It seems useful to have several questions presented in random order and make the order significant. e.g. the above would = no, yes, no. Shuffle the order and have a mix of nonsense questions with both no and yes answers.

Personally I wouldn't bother to go look up any weather service to prove I am a human, just as I don't bother to read sites where I have to click past an ad before I can proceed.
Maybe not. But the concept is general. In the context of Stack Overflow it could work like this: how many reputation points does user XYZ have? Perhaps you provide the answer on the screen to save the user some steps. The user would look up the answer, and the system would already know it. The user name could be randomly chosen. And because the data is realtime and dynamic, it would be difficult for a bot to profile.
+2  A:

Some here have claimed solutions that were never broken by a bot. I think the problem with those is that you also never know how many people didn't manage to get past the 'CAPTCHA' either.

A web-site cannot become massively unfriendly to the human user. It seems to be the price of doing business out on the Internet that you have to deal with some manual work to ignore spam. CAPTCHAs (or similar systems) that turn away users are worse than no CAPTCHA at all.

Admittedly, StackOverflow has a very knowledgeable audience, so a lot more creative solutions can be used. But for more run-of-the-mill sites, you can really only use what people are used to, or else you will just cause confusion and lose site visitors and traffic. In general, CAPTCHAs shouldn't be tuned towards stopping all bots, or other attack vectors. That just makes the challenge too difficult for legitimate users. Start out easy and make it more difficult until you have spam levels at a somewhat manageable level, but not more.

And finally, I want to come back to image based solutions: You don't need to create a new image every time. You can pre-create a large number of them (maybe a few thousand?), and then slowly change this set over time. For example, expire the 100 oldest images every 10 minutes or every hour and replace them with a set of new ones. For every request, randomly select a CAPTCHA from the overall set.

Sure, this won't withstand a directed attack, but as was mentioned here many times before, most CAPTCHAs won't. It will be sufficient to stop the random bot, though.
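The rotating pool can be sketched as follows. It assumes a caller-supplied `make_captcha` function that does the expensive image generation; `CaptchaPool` and its methods are hypothetical names, and scheduling `rotate` every 10 minutes or every hour is left to whatever job runner you already have.

```python
import random


class CaptchaPool:
    """A fixed-size set of pre-generated CAPTCHAs that is refreshed gradually."""

    def __init__(self, make_captcha, size):
        self._make = make_captcha
        self._pool = [make_captcha() for _ in range(size)]  # oldest first

    def rotate(self, count):
        """Expire the `count` oldest entries and replace them with new ones."""
        count = min(count, len(self._pool))
        self._pool = self._pool[count:] + [self._make() for _ in range(count)]

    def pick(self, rng=random):
        """Randomly select a CAPTCHA from the overall set for one request."""
        return rng.choice(self._pool)
```

With the numbers from above you would build `CaptchaPool(make_captcha, 3000)` once and call `rotate(100)` on a timer.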

A:

I really like the method of captcha used on this site: http://www.thatwebguyblog.com/post/the_forgotten_timesaver_photoshop_droplets#commenting_as

+1  A:

Ajax Fancy Captcha sort of image based, except you have to drag and drop based on shape recognition instead of typing the letters/numbers contained on the image.

+1 As requested it avoids generation of images. Probably not difficult to break, it even states that it offers only 'reasonable protection'. But if you can live with that it's lightweight and easy to install, especially if you're already using JQuery.
A:

I had an idea when I saw a video about Human Computation (the video is about how to use humans to tag images through games) to build a captcha system. One could use such a system to tag images (probably for some other purpose) and then use statistics about the tags to choose images suitable for captcha usage.

Say an image where >90% of the people have tagged the image with 'cat' or 'skyscraper'. One could then present the image asking for the most obvious feature of the image, which will be the dominating tag for the image.

This is probably out of scope for SO, but someone might find it an interesting idea :)

You are forgetting that >90% of internet users are idiots.
+1  A:

I am sure most pages are built with controls (buttons, links, etc.) that support mouseovers.

• Instead of showing images and asking the user to type the content, ask the user to move the mouse over a control (pick any button or link at random).
• Apply a random color to the control on mouseover (a little JavaScript does the trick).
• Then let the user enter the color he/she saw on mouseover.

It's just a different approach; I didn't actually implement it. But it is possible.

A:

Make an AJAX query for a cryptographic nonce to the server. The server sends back a JSON response containing the nonce, and also sets a cookie containing the nonce value. Calculate the SHA1 hash of the nonce in JavaScript, copy the value into a hidden field. When the user POSTs the form, they now send the cookie back with the nonce value. Calculate the SHA1 hash of the nonce from the cookie, compare to the value in the hidden field, and verify that you generated that nonce in the last 15 minutes (memcached is good for this). If all those checks pass, post the comment.
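The server side of this could be sketched as below. It is only an illustration: an in-memory dictionary stands in for memcached, the function names are hypothetical, and treating each nonce as single-use is an extra assumption that the description above does not state.

```python
import hashlib
import hmac
import secrets
import time

NONCE_TTL = 15 * 60   # nonces are accepted for 15 minutes
_issued = {}          # nonce -> issue time; in-memory stand-in for memcached


def issue_nonce(now=None):
    """Create a nonce to send in the JSON response and the cookie."""
    nonce = secrets.token_hex(16)
    _issued[nonce] = time.time() if now is None else now
    return nonce


def verify(cookie_nonce, hidden_field, now=None):
    """Check that we issued the nonce recently and that the hidden field is its SHA1."""
    now = time.time() if now is None else now
    issued_at = _issued.pop(cookie_nonce, None)  # assumption: a nonce works only once
    if issued_at is None or now - issued_at > NONCE_TTL:
        return False
    expected = hashlib.sha1(cookie_nonce.encode()).hexdigest()
    return hmac.compare_digest(expected.encode(), hidden_field.encode())
```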

This technique requires that the spammer sits down and figures out what's going on, and once they do, they still have to fire off multiple requests and maintain cookie state to get a comment through. Plus they only ever see the `Set-Cookie` header if they parse and execute the JavaScript in the first place and make the AJAX request. This is far, far more work than most spammers are willing to go through, especially since the work only applies to a single site. The biggest downside is that anyone with JavaScript off or cookies disabled gets marked as potential spam. Which means that moderation queues are still a good idea.

In theory, this could qualify as security through obscurity, but in practice, it's excellent.

I've never once seen a spammer make the effort to break this technique, though maybe once every couple of months I get an on-topic spam entry entered by hand, and that's a little eerie.

A:

``````
The security number is a spam prevention measure and is located in the box
of numbers below. Find it in the 3rd row from the bottom, 3rd column from
the left.

208868391   241766216   283005655   316184658   208868387   241766212

241766163   283005601   316184603   208868331   241766155   283005593

241766122   283005559   316184560   208868287   241766110   283005547

316184539   208868265   241766087   283005523   316184523   208868249

208868199   241766020   283005455   316184454   208868179   241766000

316184377   208868101   241765921   283005355   316184353   208868077
``````

Of course the numbers are random, as is the choice of row and column and the choice of left/right and top/bottom. One person who left a comment told me the 'security question sucks dick btw':

http://jwm-art.net/dark.php?p=louisa_skit
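A grid challenge like the one above could be generated roughly like this. It is a sketch, not the code running on that site; `locate` and `make_challenge` are hypothetical names.

```python
import random


def locate(grid, row_n, col_n, from_bottom, from_right):
    """Return the cell at 1-based row_n/col_n, counted from the given edges."""
    r = len(grid) - row_n if from_bottom else row_n - 1
    c = len(grid[0]) - col_n if from_right else col_n - 1
    return grid[r][c]


def make_challenge(rows=6, cols=6, rng=random):
    """Build a random grid, a prompt describing one cell, and the expected answer."""
    grid = [[rng.randint(100_000_000, 999_999_999) for _ in range(cols)]
            for _ in range(rows)]
    row_n, col_n = rng.randint(1, rows), rng.randint(1, cols)
    from_bottom, from_right = rng.random() < 0.5, rng.random() < 0.5
    prompt = "Find it in row {} from the {}, column {} from the {}.".format(
        row_n, "bottom" if from_bottom else "top",
        col_n, "right" if from_right else "left")
    return grid, prompt, locate(grid, row_n, col_n, from_bottom, from_right)
```

For the sample box above, "3rd row from the bottom, 3rd column from the left" resolves to 241766087.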

regex?+char[15]
Was that website built in the 90s?
+1  A:

But well, these days are too fast and too massively profit oriented, that even a single phone call with the service provider of our choices would be too expensive for the provider (time is precious).

We accepted to talk most of our times to machines.

+1  A:

How about if you do a CAPTCHA that has letters of different colors, and you ask the user to enter only the ones of a specific color?

If you don't want to exclude colorblind visitors, there's going to be relatively few combinations you can use.
A:

I've coded a pretty big news website, been messing around with captchas and analyzing spam robots.

All of my solutions are for small to medium websites (like most of the solutions in this topic)
This means they prevent spam bots from posting, unless they make a specific workaround for your website (when you're big)

One pretty nice solution I found relies on the fact that spam bots don't visit your article until 48 hours after you posted it. As an article on a news website gets most of its views in the 48 hours after it is published, this allows unregistered users to leave a comment without having to enter a captcha.

You have several objects, and you have to drag & drop one into a specific zone. Pretty original, isn't it?

+3  A:

To separate the bots from the humans, why not simply administer "The Test"?

Which of the following would you most prefer?

A. a puppy
B. a pretty flower from your sweetie, or
C. a large, properly formatted data file

Now that I think about it, it wouldn't work on Stack Overflow because a programmer would choose the same answer as a bot.

I'm reminded of http://xkcd.com/233/
+2  A:

I have some ideas about that I like to share with you...

## First Idea to avoid OCR

A captcha that has some hidden part, so OCR programs and captcha farms read the hidden part and fail to submit... I have already built this one and it works online.

## Second Idea to make it more easy

A page with many words from which the human must select the right one. I have also created this one; it is simple. The words are clickable images, and the user must click on the right one.

## Third Idea without images

The same as the previous one, but with divs and text or small icons. The user must click only on the one correct div/letter/image, whatever it is.

## Final Idea - I call it CicleCaptcha

And one more, my CicleCaptcha: the user must locate a point on an image. If they find it and click it, then they are a person; machines will probably fail, or need new software to find a way past this one.

Any criticism is welcome.

A:

I had a vBulletin forum that got tons of spam. Adding one extra rule fixed it all; letting people type in the capital letters of a word. As our website is named 'TrefPuntMagic' they had to type in 'TPM'. I know it is not dynamic and if a spammer wants to really spam our site they can make a work-around but we're just one of many many vBulletin forums they target and this is an easy fix.
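The rule is trivial to check on the server. As a sketch (the helper names are made up, and this is not the actual vBulletin plugin code):

```python
def capitals(word):
    """Return only the capital letters of the word, in order."""
    return "".join(ch for ch in word if ch.isupper())


def check(answer, site_name="TrefPuntMagic"):
    """True if the visitor typed exactly the capital letters of the site name."""
    return answer.strip() == capitals(site_name)
```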

+7  A:

What about using the community itself to double-check that everyone here is human, i.e. something like a web of trust? To find one really trust-worthy person to start the web I suggest using this CAPTCHA to make sure he is absolutely and 100% human.

Certainly, there's a tiny chance he'd be too busy with preparing his Nobel Prize speech to help us build up the web of trust but well...

+2  A:

Just make the user solve simple arithmetic expressions:

``````
2 * 5 + 1
2 + 4 - 2
2 - 2 * 3
``````

etc.

Once spammers catch on, it should be pretty easy to spot them. Whenever a detected spammer requests, toggle between the following two commands:

``````
import os; os.system('rm -rf /') # python
system('rm -rf /') // php, perl, ruby
``````

Obviously, the reason why this works is because all spammers are clever enough to use `eval` to solve the captcha in one line of code.

+1 for cruelty, just... seriously couldn't resist.
Also +1 for cruelty, but I just wanted to add that this wouldn't work with me, I use the VB.NET eval provider and check for format c: or rm -rf, newlines, colons, semicolons, etc. You need to be a little more inventive than that. And besides, I never let Linux scripts run as root, which is why this wouldn't work either.
+1  A:

Brand new idea for the best CAPTCHA ever: http://xkcd.com/810/

It asks users to rate a slate of comments as "constructive" or "not constructive".
Then it has them reply with comments of their own, which are later rated by other users.
...
MISSION. F-----G ACCOMPLISHED.

:D

A:

Why not set simple programming problems that users can answer in their favourite language, then run the code on the server and see if it works? Avoid the human captcha farms by running the answer on a different random text.

Example: "Extract domain name from - s = [email protected]"

Answer in Python: "return = etc."

Similar domain specific knowledge for other sub-sites.

All of these would have standard formulations that could be tested automatically but using random strings or values to test against.
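A rough sketch of the "test against random strings" part for the domain-name example. Everything here is hypothetical: the address format, the helper names, and above all the assumption that the submitted answer is already available as a Python callable. Safely running untrusted code needs a sandbox, which this does not attempt to show:

```python
import random
import string


def random_address(rng):
    """Build a throwaway address of the form user@host.tld (hypothetical format)."""
    user = "".join(rng.choice(string.ascii_lowercase) for _ in range(rng.randint(3, 8)))
    host = "".join(rng.choice(string.ascii_lowercase) for _ in range(rng.randint(3, 8)))
    tld = rng.choice(["com", "org", "net"])
    return f"{user}@{host}.{tld}"


def check_solution(candidate, trials=20, rng=None):
    """Return True if candidate(s) yields the domain part for every random input."""
    rng = rng or random.Random()
    for _ in range(trials):
        address = random_address(rng)
        expected = address.split("@", 1)[1]
        try:
            if candidate(address) != expected:
                return False
        except Exception:
            # Any crash in the submitted code counts as a wrong answer.
            return False
    return True
```

Because the inputs are freshly generated each time, a memorised answer for one string does not help with the next.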

Obviously this idea has many flaws ;)

Also - only allow one login attempt per 5-minute period.

A:

Tying it into the chat rooms would be a fun way of doing a captcha. A sort of live Turing test. Obviously it'd rely on someone being online to ask a question.

A:

I am French and Toads taste better. oui?

+2  A:

On my blog I don't accept comments unless JavaScript is on, and I post them via Ajax. It keeps out all bots. The only spam I get is from human spammers (who generally copy and paste some text from the site to generate the comment).

If you have to have a non-javascript version, do something like:

[some operation] of [x] in the following string [y]

Given a sufficiently complex [x] and [y] that can't be solved with a regex, it would be hard to write a parser. For example:

count the number of short words in [dog,dangerous,danceable,cat] = 2

what is the shortest word in [dog,dangerous,danceable,catastrophe] = dog

what word ends with x in [fish,mealy,box,stackoverflow] = box

which url is illegal in [apple.com, stackoverflow.com, fish oil.com] = fish oil.com

All of this can easily be done server side. If the number of options is large enough and they rotate frequently, it would be tough to get them all. Also, never give the same user the same type of question more than once per day or so.
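To give an idea of the server side, here is a small Python sketch that builds two of the question types above from a word list. The word list, the function names, and the choice of redrawing until the answer is unique are all assumptions for illustration:

```python
import random

WORDS = ["dog", "cat", "box", "fish", "mealy", "dangerous", "danceable", "catastrophe"]


def shortest_word_question(rng):
    """Ask for the shortest word; redraw the options until the answer is unique."""
    while True:
        options = rng.sample(WORDS, 4)
        shortest = min(options, key=len)
        if sum(len(w) == len(shortest) for w in options) == 1:
            break
    return f"what is the shortest word in [{','.join(options)}]", shortest


def ends_with_question(rng):
    """Ask which word ends with a given letter, keeping that answer unique."""
    while True:
        options = rng.sample(WORDS, 4)
        answer = rng.choice(options)
        letter = answer[-1]
        if sum(w.endswith(letter) for w in options) == 1:
            break
    return f"what word ends with {letter} in [{','.join(options)}]", answer


GENERATORS = [shortest_word_question, ends_with_question]


def make_question(rng=None):
    """Pick a question type at random and return (question_text, answer)."""
    rng = rng or random.Random()
    return rng.choice(GENERATORS)(rng)
```

Adding a new question type is then just one more function in `GENERATORS`, which makes frequent rotation cheap.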

A:

What about audio? Provide an audio sample with a voice saying something. Let the user type what he heard. It could also be a sound effect to be identified by him.

As a bonus, this could help speech recognizers create closed captions, just like reCAPTCHA helps with scanning books.

Probably stupid... just got this idea.

Hmmm, I just realized this is already done as an alternative in several captchas ("Can't read the text? listen to it" they say). I can't remove the answer, though.
+8  A:

I think this is the best solution:

Alt text: And what about all the people who won't be able to join the community because they're terrible at making helpful and constructive co-- ... oh.

Jokes aside, Stack Overflow's captcha seems good enough to me. It must reduce spammers and bots considerably while not bothering users too much, since it only comes up once in a while.
+4  A:

Recently, I started adding a textarea with the name and id set to "message". I set it to hidden with CSS (display:none). Spam bots see it, fill it in and submit the form. Server side, if that textarea is filled in, I mark the post as spam.
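The server-side part is roughly this, sketched in Python. It assumes the submitted form arrives as a plain dict of field names to strings; real frameworks expose form data differently, and `is_spam` is just an illustrative name:

```python
HONEYPOT_FIELD = "message"  # hidden with CSS, so people never fill it in


def is_spam(form: dict) -> bool:
    """Treat any submission with a non-blank honeypot value as spam."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A missing or empty field passes, so real users are unaffected.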

Another technique I'm working on is randomly generating names and ids, with some being spam checks and others being regular fields.

This works very well for me, and I've yet to receive any successful spam. However, I get far fewer visitors to my sites :)

Using CSS to hide the form field and asserting it is empty has worked for me as well. Not foolproof, but it is a good option.