CAPTCHA has been created to avoid machines from detecting the words. It's meant to be read by humans only. Making it more readable for blind/deaf people adds a risk of machines being able to understand them again, thus nullifying their effect.
Spammers did find a very effective way to break the more popular CAPTCHA's though. They just hire cheap labourers to read them, in return for a few cents per working account. As a result, there's a small industry around breaking CAPTCHA's to create millions of accounts that can then be used to send more spam. Compared to the amount gained by the spammers, the costs is almost none. A similar solution could be used by blind/deaf people, who would send the CAPTCHA image to some cheap labourer in China or wherever, where they will reply with the correct words and the blind/deaf person will be able to proceed. Unfortunately, blind people only need this service only a few times while spammers need a continuous flow, thus those labourers will prefer to work for spammers instead. (The pay is better.) Still, the best solution would be to send the CAPTCHA to some friend, let them read and/or decipher it and return the answer.
The ReCAPTCHA style also reads out the words. A simple speech recognition application might be able to recognise whatever is said, although speech recognition still needs more optimizations. Still, you might want to work from that angle, getting the application to listen to the sound byte instead.
When it is possible to break CAPTCHA's, they will just think of better CAPTCHA-like methods. OCR techniques are still improving thus more work will be done to make CAPTCHA's harder. That is, until OCR has become as good as the human eye at recognizing words...
An algorithm could be created, although slow. With 26 lowercase and 26 uppercase letters and 10 digits, it should not be too difficult to come up with an algorithm. With Serif and Sans-serif fonts, the number of combinations would need to be doubled, though. Still, if you try to curve all letters in a similar way as the letter in the CAPTCHA, you should be able to detect a letter which gets covered by the CAPTCHA letter the most. And that would be the most likely candidate. Still needs you to clear lines, dirt and other artefacts from the image that the human eye has less trouble to recognise than a computer. You'd need the following steps:
- Clean up the image.
- Detect the locations of the letters.
- For every letter
3a. Determine the curve of the letter by checking the left side.
3b. Do an overlay of every possible letter/digit to find the one that covers it the best. (That's the most likely letter.)
- Once you've found the word, do a dictionary check to make sure it's a real word. (Unless the CAPTCHA doesn't use real words.)
Even though they can twist the letters in the CAPTCHA's, it should be possible to detect the twist rotation that they used simply by looking at the left side of every letter and then trying to apply the same curve to every letter. (52 combinations, plus 10 digits, if digits are also used.) Basically, you'd try to put a box around every letter, then check which letter will contain the least amount of white space. That's the most likely letter.
The main reason why this isn't often used for OCR is basically the need for speed. Step 3a/b tends to be slow, especially if you have to take font style in consideration.
Making this answer bigger but in reply to one of the comments:
There are several ways to cleanup an image. You'd need some color filtering, noise reduction and an algorithm that's able to recognise the noisy lines through an image. The DEFCON slideshow that you've pointed to shows a few simple techniques to filter away some of the noise. It shows that a basic image processing tool can already make an image a lot clearer for a machine to read. A simple blur will clean up random dots and thin lines while color filters would filter away the noisy colours. A next step would be to try to put a box around every letter in the CAPTCHA, hoping the system is able to recognise their locations. I don't know any practical algorithms for this but there should be ways to recognise them. There's software that can create vector images from bitmaps, thus there should be software that's able to calculate a box around a letter.
It is likely that this box won't have rectangular corners, thus you would have to distort all 52 letters to match the same box. Italic or bold shouldn't make much of a difference since these styles are just additional distortions. Serif or Sans-serif does make a difference, though. Serif fonts tend to have a few more spikes and ornaments. Fortunately, there are algorithms that can transform a box to any other figure with four corners.
Regular OCR applications will assume that letters are mostly straight and will just check a few hotspots to find a match. Thus, they sometimes get it wrong because of noise. To crack CAPTCHA, you would need a more sensitive match, preferably "XOR-ing" the CAPTCHA letter image with an image of one of the 52 letters, then counting the number of black and white spots to calculate the ratio. Assuming white=1 and black=0, the result of the XOR should be almost black for the best match.
I think several spammers have already found some useful algorithms to crack CAPTCHA's but for them, keeping these algorithms a secret just keeps them in business.
Another comment, more text. :-)
Segmentation would be a problem, but it's not impossible to solve. It's just extremely complex. But when you've cleaned the image, it should be possible to calculate two lines. One line that touches the bottom of every letter and a second line that touches the top. However, good CAPTCHA's won't put letters on the same lines any more, but those not-so-good ones could be cracked by just following the lines. (Guess? ReCAPTCHA puts letters between two lines!) With two lines, you know the first letter will start at the left, thus you can try overlaying all 52 possibilities there until you've found a match. When you found one, move to the right for the second one. And further until you've read all letters. With two lines to guide you, you don't need a complete box.
Letters tend to use a constant ratio between width and height. With two lines, you can calculate the height of the complete letter and thus get a good estimation of the matching width.
Still, working out the correct algorithm to calculate this all is a bit too much for my poor math skills. You'd need an expert mathematician to crack this algorithm.