Incoming Bot

The purpose of a CAPTCHA is simple: protect a website from malicious attacks (e.g. spam) by being difficult or impossible for bots while staying easy enough to let humans through. But what happens when the most commonly used CAPTCHA service can be solved with 97%+ accuracy by the very bots it was designed to beat?

For over a decade, text-based CAPTCHAs have been the popular choice for this task. They grab a word (usually English), warp it into a shape not commonly seen and then ask users to type what they see. Some text CAPTCHAs even use a random assortment of letters and numbers in an attempt to hinder the bots even more. The issue? Programs that use Optical Character Recognition, known as OCR, can read the distorted text and let bots through to websites that relied on the security service to prevent that very thing from happening.


This, unfortunately, is a common problem. By design, text CAPTCHAs have a shelf life – in order for them to remain difficult for bots, they have to become increasingly harder for humans. It appears that we’ve reached the ceiling for text CAPTCHA effectiveness, which is a big motivation for our creation of FunCaptcha.

The internet was built on innovation and that’s exactly what we’re doing with FunCaptcha – innovating an area of web security that sorely needs it.

Update: watch co-founder and CAPTCHA expert, Matthew Ford, go into detail on this topic in our new video series!

At the start of December, a rather large update to the traditional reCAPTCHA technology was announced, dubbed the “No CAPTCHA reCAPTCHA” experience. For many, it came as a pleasant surprise – no more squiggly letters and hard-to-read numbers and images? What has been a frustrating experience for millions of internet users the world over looked to be getting a big injection of convenience.

The old ReCAPTCHA.

But when the mechanics behind the “new” technology were broken down via reverse engineering, many developers asserted that this newly developed convenience is merely the addition of a “whitelist”. To put it simply: a user’s past behavior and previous CAPTCHA solves are recorded in their cookies, which future reCAPTCHA challenges then detect. Those seen as genuine users get the “No CAPTCHA experience”, while those that aren’t are sent back to the usual distorted-text reCAPTCHA.

The new ReCAPTCHA.

The existing mechanics (and thus, flaws) behind the reCAPTCHA system are still there, but with the introduction of this cookie “whitelist”, perhaps reCAPTCHA could be made easier for users without simultaneously making it easier for bots. However, this looks to have backfired because of two main issues.

Easier for humans, easier for bots

The way reCAPTCHA uses its new whitelist system has actually made it easier to exploit for no gain, according to Egor Homakov, a consultant at www.sakurity.com. In a blog post from December 4th, he sums up his findings (namely the whitelist and its consequences), but we wanted to break them down further for readers who may not have the experience necessary to fully grasp the conclusions Egor comes to.

His first main concern is that relying on cookies for extra convenience doesn’t add any extra security at all. If the sole goal was to make it easier for humans without amplifying the existing security, then technically, it was a success. Egor declares this is important because the “No CAPTCHA reCAPTCHA Experience” doesn’t make it harder for bots, just easier for humans.

This is a problem, Egor says, due to the way the whitelist is implemented, allowing exploitation because “the legacy flow is still available and old OCR bots can keep recognizing” the old CAPTCHA.
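Egor's point can be illustrated with a toy model. The sketch below is not reCAPTCHA's actual logic: the function names, the boolean "trusted cookie" flag and the two flow labels are hypothetical, and the 0.97 figure is simply the OCR accuracy quoted at the top of this post. It only shows that a bot arriving without any trusted history is served the same legacy text challenge as before, so its odds do not change.

```python
def served_flow(has_trusted_cookie: bool) -> str:
    """Hypothetical routing: trusted visitors skip the challenge, everyone else gets legacy text."""
    return "no_captcha" if has_trusted_cookie else "legacy_text"


def bot_pass_rate(has_trusted_cookie: bool, ocr_accuracy: float) -> float:
    """Chance that an OCR bot gets through under this toy model."""
    if served_flow(has_trusted_cookie) == "no_captcha":
        # A whitelisted visitor is waved through without solving anything.
        return 1.0
    # The legacy flow is still there, so the OCR solver works exactly as it did before.
    return ocr_accuracy


if __name__ == "__main__":
    # A bot with no cookie history still faces only the old distorted text.
    print(bot_pass_rate(has_trusted_cookie=False, ocr_accuracy=0.97))
```

Under these assumptions the whitelist never lowers a bot's pass rate: the worst case for the attacker is the old challenge it could already solve.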

For those making alternative CAPTCHAs, this was an interesting point of difference raised by Egor. For example, FunCaptcha takes the opposite approach to the one reCAPTCHA now uses. Instead of becoming easier after repeated completions, FunCaptcha becomes harder after repeated mistakes. This is for two reasons:

1) To make a CAPTCHA that is inherently fast and easy for humans even easier would compromise its security against bots for no real gain.

2) A major vulnerability for visual CAPTCHAs with a small number of discrete answers is a brute-force attack by a bot, which performs automated guessing over and over until it breaks through. By tracking the history of the IP and making the CAPTCHA’s string of challenges longer after each failed attempt, a brute-force attack quickly becomes impractical.
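To make the second reason concrete, here is a minimal sketch of the idea, not FunCaptcha's actual implementation. The class name, the constants (one starting round, a cap of eight rounds, eight possible answers per round) and the in-memory dictionary are all assumptions chosen for illustration. It tracks failed attempts per IP, lengthens the string of challenges after each failure, and shows how quickly the odds of a blind guess shrink.

```python
BASE_ROUNDS = 1         # hypothetical: rounds shown to an IP with no failures
MAX_ROUNDS = 8          # hypothetical cap so a clumsy human is never locked out
CHOICES_PER_ROUND = 8   # hypothetical number of discrete answers per round


class EscalatingChallenge:
    """Tracks failed attempts per IP and lengthens the challenge accordingly."""

    def __init__(self):
        self._failures = {}  # ip -> number of failed attempts seen so far

    def record_failure(self, ip: str) -> None:
        self._failures[ip] = self._failures.get(ip, 0) + 1

    def rounds_for(self, ip: str) -> int:
        # One extra round per past failure, up to the cap.
        return min(BASE_ROUNDS + self._failures.get(ip, 0), MAX_ROUNDS)


def guess_success_probability(rounds: int, choices: int = CHOICES_PER_ROUND) -> float:
    """Chance that uniform random guessing passes every round in a row."""
    return (1 / choices) ** rounds


if __name__ == "__main__":
    tracker = EscalatingChallenge()
    ip = "203.0.113.7"  # address from the documentation-only range
    for attempt in range(4):
        rounds = tracker.rounds_for(ip)
        print(attempt, rounds, guess_success_probability(rounds))
        tracker.record_failure(ip)  # assume the bot guessed wrong
```

With these made-up numbers, a first-time visitor faces one round (a 1 in 8 chance for a blind guess), while an IP with two failures faces three rounds (1 in 512). A real deployment would also need persistent storage and some way to expire old failures, which this sketch leaves out.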

Furthermore, many developers are puzzled by these changes – as explained by Egor’s findings, by trying to make the reCAPTCHA process more convenient, the latest changes have arguably compromised its security.

Removing Challenge/Response has removed the challenge – for bots

Egor goes on to explain that by introducing the cookie whitelist as a replacement for the traditional “challenge/response” method, the service has become even more vulnerable to malicious attack via a process called “clickjacking”. If a valid cookie whitelist has been accumulated (known as “g-recaptcha-response”), then the user gets the “free pass”. How is this abused? Simply click the video below to get a look at the exploit in action.

Keep in mind: we are NOT providing the technical step-by-step recipe on HOW to do this – simply the result of the exploit being implemented.

To reword Egor’s assertion and explain the above video as simply as possible: the person wanting to spam a certain website needs to obtain, via an unsuspecting user, a valid “g-recaptcha-response” that matches the required credentials of the targeted website. This is done by creating a fake variant of the target website’s reCAPTCHA, having an unsuspecting user complete this fake variant and then using the generated “g-recaptcha-response” to give bots access to the original target’s website through the now breakable reCAPTCHA. This is made possible because the “g-recaptcha-response” token is made available before submission to the CAPTCHA.

SO WHAT DOES THIS ALL MEAN?

The conclusion that can be drawn from Egor’s findings? While the convenience of reCAPTCHA has somewhat increased for some users, so has the vulnerability. He proposes that the implementation of the cookie whitelist has not only opened the service to exploitation in and of itself, it has also opened a gateway into the existing technology by replacing challenge/response with the “g-recaptcha-response” token.

CAPTCHA innovation has started to occur around the globe so there certainly are more options now. For developers of secure alternative types of CAPTCHA, the goal is to provide a method that, at its core, is already so quickly solvable that it makes room for the challenge to become lengthier in response to brute-force attacks, while still staying reasonable for humans accidentally caught in the net. Forcing it to become trivially solvable after building a whitelist of “human” behavior would be both pointless and potentially damaging – resulting in the position that Egor believes reCAPTCHA now finds itself in.