I am a bit confused about how reCAPTCHA works. I have implemented it in Ruby on Rails.

Sometimes, even if I enter only one of the two words, verification returns true, while at other times it fails.

I am really confused and not able to understand the behaviour of reCAPTCHA.

+6  A: 

Only one of the recaptcha words is "known" by the system - it is relying on the user performing the captcha to tell the system what the other word is, because it is not machine-readable.

That is the "point" of reCAPTCHA, or the added benefit: it is not only performing a human test, it is also crowdsourcing transcription on a massive scale for words where automated OCR has failed.

Rex M
A: 

reCAPTCHA uses two words, one of which is known and one of which is unknown (the unknown word is the one the program is trying to help decipher; it was probably scanned out of an old book somewhere). So really, all the service is looking for is the right answer to the KNOWN word. If that's the word you put in, it will succeed even if you don't put in anything for the unknown word. If you put in only the other word (the unknown one), it will fail.
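
To make that concrete, here is a minimal sketch of the check in Python. It is not reCAPTCHA's actual code: `Challenge`, `check_response`, and the exact-match rule are all hypothetical names and assumptions, used only to illustrate why entering one word sometimes passes and sometimes fails.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Challenge:
    # Hypothetical model: the control word whose answer the service already knows.
    known_word: str
    # Answers collected so far for the unknown word.
    guesses: List[str] = field(default_factory=list)


def normalize(word: str) -> str:
    """Compare case-insensitively and ignore surrounding whitespace."""
    return word.strip().lower()


def check_response(challenge: Challenge, response: str) -> bool:
    """Pass only if the known word appears in the response (assumed rule)."""
    tokens = [normalize(t) for t in response.split()]
    target = normalize(challenge.known_word)
    if target not in tokens:
        return False
    # Whatever else was typed is stored as a guess for the unknown word.
    tokens.remove(target)
    challenge.guesses.extend(tokens)
    return True
```

Under these assumptions, typing only the known word passes, typing only the unknown word fails, and typing both passes while recording the second word as a guess. Since the user cannot tell which word is the known one, entering a single word looks like it works at random.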

burningstar4
+1  A: 

reCAPTCHA shows two words: one that a computer scanner has scanned and recognized, and one that the scanner cannot recognize. reCAPTCHA checks your answer for the word it already knows and saves your response for the unknown word. The responses to each unknown word are compiled and analyzed, so that the word is essentially "solved" by humans and not by the computer scanner.

Here's more info, in their own words:

"But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct."

source - http://www.google.com/recaptcha/learnmore
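
As a rough illustration of the "higher confidence" step in that quote, here is a small Python sketch of tallying answers for one unknown word. The function name `consensus` and the thresholds `min_votes` and `min_share` are assumptions made for this example; the quoted page does not say how many answers are needed or how they are weighed.

```python
from collections import Counter
from typing import List, Optional


def consensus(guesses: List[str], min_votes: int = 3,
              min_share: float = 0.6) -> Optional[str]:
    """Return the agreed-upon word, or None if there is no clear winner yet.

    min_votes and min_share are hypothetical thresholds, not published values.
    """
    # Count answers case-insensitively, skipping empty submissions.
    votes = Counter(g.strip().lower() for g in guesses if g.strip())
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    total = sum(votes.values())
    # Require both enough votes and a large enough share of all answers.
    if count >= min_votes and count / total >= min_share:
        return word
    return None
```

With these example thresholds, three matching answers out of four would settle the word, while a single answer or an even split would leave it unsolved and it would keep being shown to more people.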

programatique
Are you sure that one of the words was recognized by the scanner? Couldn't it use a word that was solved by previous users?
Greg
Yes, it does use words that are solved by previous users. One word is one the scanner can read, and the other is one the scanner cannot read. The word the scanner cannot read is solved by multiple users, which builds up a consensus as to what the correct word is (so the "solving" of the unknown word doesn't rest on just one answer). I've added the link to the reCAPTCHA about page in my answer above.
programatique
A: 

I think that's the main point of reCAPTCHA. It helps developers tell humans from robots, and it also helps digitize books.

There are always two words. One is easier to read. If you can read this word, it's fine, you're human.

The second word is a scan from a book where automatic OCR (optical character recognition) is not sure about the word. So users are helping to read this word so that books can be digitized more accurately.

dwich
Actually both are scans, but one has already been identified by N users. Once that threshold number of users have identified the second word, it is added to the known ones.
Martin Beckett