Captchas broken by Stanford Researchers

Are you sure you're not a robot?

Captchas, those fun little letter-guessing mini-games that have become ubiquitous throughout the anti-robot web, are not as robot-proof as you might think. A team of Stanford researchers created a tool called DeCaptcha that uses algorithms to reconstruct the letters and numbers in a Captcha into a computer-readable form. While success rates vary from implementation to implementation (25% for Wikipedia, 70% for Visa), Elie Bursztein, a researcher on the team, claims that if even 1% of the Captchas are breakable, the whole system needs to be thrown out.

According to Bursztein, Captchas (the name stands for "Completely Automated Public Turing Test to tell Computers and Humans Apart") aren't nearly as secure as the computing public thinks they are. "Most Captchas are designed without proper testing and no usability testing. We hope our work will push people to be more rigorous in their approach in Captcha design."

Blizzard, when approached on the subject, countered that Captchas were never meant to be the ultimate security tool against bots. While the vulnerabilities exposed by the Stanford team are serious and will hopefully be investigated in due course, no single security barrier protects against every threat. Captchas, along with complex password rules, email verification and a slew of other known and unknown security processes running in the background, create a flexible, layered security system that can mitigate as many threats as possible. Captcha is only one tool in many websites' security arsenals, so don't stop using Visa or Blizzard because Captcha isn't perfect.

38 Comments

Man, the captcha I got the other night had a small screenshot of the green Matrix code and 1 word. I was like, htf am I gonna put that in?

Isn't the deCAPTCHA project years old already?
I remember reading about this somewhere more than 2 years ago.
Not to feed bots, but to prove it's not working so sites would stop using these annoying sht ass captcha images everywhere -.-
I get reCAPTCHA right more than 95% of the time (it's easy, come on people, you just have to guess 1 out of 2 words)

And this is why I like the Captchas that make you put a picture together more, and plus they are more user friendly.

Ok, you have this site with Captcha. The site serves downloadable content and for each file you need to "solve" one Captcha quiz. Since the content is in high demand (say porn), a lot of users use your service.

Behind the scenes this site is running multiple bots. The bots cannot solve Captchas on the sites they are trying to get into, so they give them to this download site to let real users solve them. The people downloading (probably pirated) videos are then actually reading words and feeding them to bots.

If such a setup does not yet exist, I'll be surprised if it takes even a year before it happens.

Or you can be like China and pay people to answer Captchas all day so eventually you have enough to work with you can guess with about 75% accuracy.

Xilo said,
Or you can be like China and pay people to answer Captchas all day so eventually you have enough to work with you can guess with about 75% accuracy.

I only get about 75% accuracy with captchas anyway, even when I'm pretty sure I typed what it said it usually still fails me

Flatval said,
Ok, you have this site with Captcha. The site serves downloadable content and for each file you need to "solve" one Captcha quiz. Since the content is in high demand (say porn), a lot of users use your service.

Behind the scenes this site is running multiple bots. The bots cannot solve Captchas on the sites they are trying to get into, so they give them to this download site to let real users solve them. The people downloading (probably pirated) videos are then actually reading words and feeding them to bots.

If such a setup does not yet exist, I'll be surprised if it takes even a year before it happens.

I suspect that's exactly what's happening with sites like Rapidshare...

Captcha is just flat-out annoying. It certainly doesn't help with Yahoo's chatrooms, not that I go there anymore, just sayin'.

The problem here is that CAPTCHAs were never meant for use in any type of security application. This is just programmers and companies re-purposing an existing system to add a layer of pseudo-security onto another process.

CAPTCHAs were intended to prevent a bot from performing a process in an automated fashion (e.g., filling out forms or creating user accounts on a forum).

Proving you are human in order to log in or perform another secure process adds no security.

This research was done on a variety of CAPTCHA systems. Download the slides PDF to view the results: http://ly.tl/p22s (Page 99)

Recaptcha, the system used by Google, performed flawlessly with a 0% solve rate.

Xinok said,
Recaptcha, the system used by Google, performed flawlessly with a 0% solve rate.

reCaptcha is not clear even for humans. I solve reCaptchas on the first try only about 60% of the time (this also counts those times when I click "change captcha" just because I am not very sure about it)

ramik said,

reCaptcha is not clear even for humans. I solve reCaptchas on the first try only about 60% of the time (this also counts those times when I click "change captcha" just because I am not very sure about it)

It's not that bad, I have maybe a 90% success rate. I prefer this over the days when they would lock you out of your account because somebody tried guessing your password.

Denis W said,
Captchas are ridiculous when they throw up text in East Asian scripts.

or mathematical symbols that belong in a physics or calculus book

Denis W said,
Captchas are ridiculous when they throw up text in East Asian scripts.

reCAPTCHA is used to digitize books. This is why, once in a while, you get something weird like this.

CAPTCHAs have been useless for years now... Not even sure why any site still uses them. Rate limiting is a far more useful option when mixed with other silent payloads such as one-time post tokens with a server-side expiry.
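
For anyone wondering what that combination might look like, here is a minimal sketch in Python, not anything from the article or a specific site. The class name `FormGuard`, its parameters and the in-memory dictionaries are all assumptions made for illustration; a real deployment would more likely keep this state in a shared store.

```python
import secrets
import time


class FormGuard:
    """Hypothetical helper: one-time form tokens with server-side expiry,
    plus a sliding-window rate limit per client."""

    def __init__(self, token_ttl=600.0, max_posts=5, window=60.0,
                 clock=time.monotonic):
        self.token_ttl = token_ttl    # seconds a token stays valid
        self.max_posts = max_posts    # posts allowed per window
        self.window = window          # window length in seconds
        self.clock = clock            # injectable clock, handy for tests
        self._tokens = {}             # token -> expiry time
        self._posts = {}              # client id -> recent post times

    def issue_token(self):
        """Create a token to embed in a hidden form field."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = self.clock() + self.token_ttl
        return token

    def _within_rate_limit(self, client_id):
        now = self.clock()
        # Keep only the posts that still fall inside the window.
        recent = [t for t in self._posts.get(client_id, [])
                  if now - t < self.window]
        allowed = len(recent) < self.max_posts
        if allowed:
            recent.append(now)
        self._posts[client_id] = recent
        return allowed

    def accept_post(self, client_id, token):
        """Return True only for a fresh, unused token from a client
        that is still under its rate limit."""
        expiry = self._tokens.pop(token, None)  # pop makes it single-use
        if expiry is None or self.clock() > expiry:
            return False
        return self._within_rate_limit(client_id)
```

The server would call `issue_token()` when rendering the form and `accept_post()` when the form comes back. A replayed or stale token is rejected without the user ever seeing a puzzle.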

Frazell Thomas said,
CAPTCHAs have been useless for years now... Not even sure why any site still uses them. Rate limiting is a far more useful option when mixed with other silent payloads such as one-time post tokens with a server-side expiry.

My server has a JavaScript marker in its comment forms. Robots will screw up the values and they will no longer match, but the marker itself is hidden so a human will leave it untouched. No captcha. The spam rate has been minimal.
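
The server-side half of a hidden marker like that could be checked roughly as below. This is only a sketch under assumptions: the poster does not say how their marker works, so the HMAC-signed value, the names `make_marker` and `marker_is_intact`, and `SECRET_KEY` are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-server secret; a real site would load a persistent key.
SECRET_KEY = secrets.token_bytes(32)


def _sign(form_id: str, nonce: str) -> str:
    message = f"{form_id}:{nonce}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_marker(form_id: str) -> str:
    """Value to place in a hidden form field when the page is rendered."""
    nonce = secrets.token_hex(8)
    return f"{nonce}:{_sign(form_id, nonce)}"


def marker_is_intact(form_id: str, submitted: str) -> bool:
    """True if the hidden field came back exactly as it was issued."""
    nonce, sep, signature = submitted.partition(":")
    if not sep:
        return False  # a bot overwrote the field with something else
    expected = _sign(form_id, nonce)
    # Compare as bytes in constant time.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

A human never sees the field, so it comes back unchanged. A bot that fills every input with its own text breaks the signature and the post can be dropped quietly.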

It's got to the point where I can't even recognize the words half the time. You know you've lost when only bots can decipher the text and you can't.

shockz said,
It's got to the point where I can't even recognize the words half the time. You know you've lost when only bots can decipher the text and you can't.

This!

shockz said,
It's got to the point where I can't even recognize the words half the time. You know you've lost when only bots can decipher the text and you can't.

I doubt bots can decipher ancient greek, hieroglyphs, music notes, or other really weird stuff you see on reCAPTCHA.

Aethec said,

I doubt bots can decipher ancient greek, hieroglyphs, music notes, or other really weird stuff you see on reCAPTCHA.

Or the foreign characters. Sorry, reCaptcha, but my keyboard doesn't have an umlaut.

Shadrack said,

Or the foreign characters. Sorry, reCaptcha, but my keyboard doesn't have an umlaut.

If I'm not wrong, you can type umlauts on US keyboards by setting the layout to US-International and pressing " followed by the letter.

Of course it's not going to stop 100% of bots, I doubt any single system would be capable of that. But it does stop 99% of bots written by amateurs who usually wish to do harm by abusing a registration system.

My boss insisted we needed one on certain forms on our website. I wrote my own, which uses a graphic of a word from a library of words related to our site, so it's sort of relevant. Still, it's an unnecessary hurdle for users. In our case bots would not have been an issue, he just felt it was more "professional"...

So, in general, this approach is a very bad idea with respect to security. For example, inventing your own encryption algorithm instead of using AES because you think you can do better will always lead to disappointment. Inventing your own captcha will also lead to similar disappointment.

As a side note, I've seen some implementations of captcha so bad that any student of "Image Processing 101" could defeat them.

Vannos said,
So, in general, this approach is a very bad idea with respect to security. For example, inventing your own encryption algorithm instead of using AES because you think you can do better will always lead to disappointment. Inventing your own captcha will also lead to similar disappointment.

As a side note, I've seen some implementations of captcha so bad that any student of "Image Processing 101" could defeat them.

I don't see anywhere that he was using his Captcha for encryption or anything critical. I don't see why you can't appreciate his innovation instead of getting uptight about it.

The_Decryptor said,
My favourite captcha was the one Sony had on their forums, it was plain text.

Lol, really?? I could write an app to strip the captcha out of plain text in a few minutes lol

The_Decryptor said,
http://pro.sony.com/bbsc/jsp/forms/generateCaptcha.jsp

I doubt it'd even take a few minutes, took me 30 seconds to whip up some JS code for outputting it.


Ummm, how? Surely you didn't implement OCR in 30 seconds. And in JS?
Edit: Never mind. The actual plain text is in the page source. Weak! All you need is a regex.

Still, very weak. The letters don't even change position.
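
To show how little work "all you need is a regex" implies, here is a toy sketch in Python. The markup in `SAMPLE_PAGE` is invented for illustration and is not the actual Sony page, whose source is not shown in this thread.

```python
import re

# Invented example markup: the "captcha" answer sits in the page as text.
SAMPLE_PAGE = (
    '<form><span class="captcha">K7QX2</span>'
    '<input name="answer"></form>'
)

# Pattern for the hypothetical markup above.
CAPTCHA_RE = re.compile(r'<span class="captcha">([A-Za-z0-9]+)</span>')


def read_plaintext_captcha(html):
    """Return the captcha text if it appears verbatim in the HTML, else None."""
    match = CAPTCHA_RE.search(html)
    return match.group(1) if match else None


print(read_plaintext_captcha(SAMPLE_PAGE))
```

No image processing is involved at all, which is the point: a challenge that ships its own answer in the page source stops nothing.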

Edited by rfirth, Oct 31 2011, 7:04pm :