
Google Captcha Dumps Distorted Text Images

Tired of reading those wavy words? Changes to Google's reCaptcha system -- which doubles as quality control for its book and newspaper scanning projects -- prioritize bot-busting puzzles based on numbers.

Google is making changes to its reCaptcha system: distorted text images are out, while numbers and more-adaptive, puzzle-based authentication checks are in.

The change is necessary because text-only Captchas are no longer blocking a sufficient number of automated log-in attempts, according to Google's reCaptcha product manager, Vinay Shet. "Over the last few years advances in artificial intelligence have reduced the gap between human and machine capabilities in deciphering distorted text," he said in a Friday blog post. "Today, a successful Captcha solution needs to go beyond just relying on text distortions to separate man from machine."

Based on extensive user testing, Google thinks it can more reliably separate real users from bots through improved risk analysis. That analysis relies in part on watching what a supposed user does before, during and after the check, and on serving up multiple puzzle-based checks. Although Shet didn't spell out exactly what these puzzles might look like, he did say that unlike humans, bots have a tough time with numbers.


"We've recently released an update that creates different classes of Captchas for different kinds of users. This multi-faceted approach allows us to determine whether a potential user is actually a human or not, and serve our legitimate users Captchas that most of them will find easy to solve," he said. "Bots, on the other hand, will see Captchas that are considerably more difficult and designed to stop them from getting through."
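
Google has not published how it decides which class of Captcha to serve, so what follows is only a rough sketch of the general idea, written in Python. The signal names, weights and thresholds are all hypothetical and chosen purely for illustration; none of them come from Google.

```python
from dataclasses import dataclass


@dataclass
class SessionSignals:
    """Hypothetical behavioral signals; not Google's actual inputs."""
    requests_per_minute: float  # how quickly the client submits forms
    has_prior_history: bool     # client was previously seen behaving normally
    failed_attempts: int        # recent failed Captcha attempts


def risk_score(signals: SessionSignals) -> float:
    """Combine the signals into a score from 0 to 1 using made-up weights."""
    score = 0.0
    if signals.requests_per_minute > 30:
        score += 0.5
    if not signals.has_prior_history:
        score += 0.2
    # Cap the contribution of failed attempts at three failures.
    score += min(signals.failed_attempts, 3) * 0.1
    return min(score, 1.0)


def choose_captcha_class(signals: SessionSignals) -> str:
    """Serve easier checks to likely humans and harder ones to likely bots."""
    score = risk_score(signals)
    if score < 0.3:
        return "easy"
    if score < 0.6:
        return "medium"
    return "hard"
```

Under these assumed weights, a slow client with a clean history gets the easy class, while a fast, unknown client with repeated failures gets the hard one.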

The Captcha challenge-response technique was first developed at Carnegie Mellon University in 2000; the name is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. The approach is designed to create a test that humans can pass, but computers can't. In theory, Captchas can be used for a variety of tasks, including preventing automated spam from appearing in blog comments, blocking automated spam-bot signup attempts for email services -- such as free Gmail accounts -- and safeguarding Web pages that site administrators don't want to be tracked by search bots.

In fact, Google purchased reCaptcha in 2009, in a bid to better block spammers who signed up for free accounts. The approach offered by reCaptcha was notable not just for presenting users with a Captcha phrase, but for drawing those phrases from scans of books. That squares with Google's own Google Books and Google News Archive Search projects, which rely on optical character recognition (OCR) scans of printed source material that aren't 100% accurate. By designating scanned content for use with the reCaptcha system, however, Google killed two birds with one stone: creating a security check, while also tapping users to manually enter or verify scanned text for free.

In short order, Google also rolled out -- and still offers -- reCaptcha as "a free anti-bot service that helps digitize books," which is available for use by any website. "Answers to reCaptcha challenges are used to digitize textual documents," according to Google's reCaptcha overview. "It's not easy, but through a sophisticated combination of multiple OCR programs, probabilistic language models, and most importantly the answers from millions of humans on the internet, reCaptcha is able to achieve over 99.5% transcription accuracy at the word level."
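
The overview quoted above does not say how the human answers are combined. One simple way to picture it is a majority vote: collect several people's answers for the same scanned word, and accept a transcription only once enough of them agree. The Python sketch below assumes that approach. The function name, the minimum vote count and the agreement ratio are illustrative choices, not reCaptcha's actual algorithm.

```python
from collections import Counter
from typing import Optional


def consensus_transcription(
    answers: list[str],
    min_votes: int = 3,
    min_agreement: float = 0.6,
) -> Optional[str]:
    """Return the agreed transcription of a scanned word, or None if unsettled."""
    # Ignore case and surrounding whitespace so "Morning " matches "morning".
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # not enough answers collected yet
    word, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_agreement:
        return word
    return None  # answers disagree too much to trust any of them
```

With these assumed thresholds, four answers of which three match are accepted, while three answers that all differ leave the word unresolved until more people have seen it.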

But no information security challenge-response system -- at least to date -- is perfect. Spam rings also have access to OCR tools, and have duly defeated many Captcha systems. Other criminal groups, echoing Google's crowd-sourced reCaptcha approach, have even tricked users into recording target sites' Captcha phrases -- most sites have a finite pool of possibilities -- with the lure of free porn.

By adopting a more adaptive approach to verifying people's identities via reCaptcha, Google has taken a page from Facebook's login verification system, which looks at a variety of factors when someone attempts to log into an account, including their geographic location, and whether they're using a computer that Facebook has seen before. For unusual types of log-ins, Facebook's system can hit would-be users with an escalating series of security challenges.

Similarly, RSA's Adaptive Authentication system, which is used by about 70 of the country's 100 biggest banks to verify their customers' identity, assesses a number of risk factors before granting access. Depending on those risk factors, users can be made to jump through more hoops before the system believes that they are who they say they are.
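
Neither Facebook nor RSA details its rules here, but the escalation idea can be sketched in a few lines of Python. The two risk factors below (a known device and a usual location) echo the ones mentioned for Facebook; the challenge names and the order in which they are added are hypothetical.

```python
def required_challenges(known_device: bool, usual_location: bool) -> list[str]:
    """Build an escalating list of checks; the riskier the login, the longer the list."""
    challenges = ["password"]  # every login starts with the baseline check
    if not known_device:
        challenges.append("captcha")
    if not usual_location:
        challenges.append("security_question")
    # Both factors look unusual: add the strongest (hypothetical) check.
    if not known_device and not usual_location:
        challenges.append("one_time_code")
    return challenges
```

A login from a familiar computer in a familiar place needs only the password under this scheme, while one that is unusual on both counts must clear all four checks.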

It's been a busy month for Captcha researchers. Earlier this month, a team of Carnegie Mellon researchers unveiled an inkblot-based Captcha system that's designed to defeat automated attacks.

This week, startup firm Vicarious claimed it has created an algorithm that can successfully defeat any text-based Captcha system, as well as defeat reCaptcha -- widely seen as the toughest Captcha system available -- 90% of the time, New Scientist reported. But Luis von Ahn, who was part of the Carnegie Mellon team that created Captchas, remains skeptical, saying he's counted 50 such Captcha-breaking claims since 2003. "It's hard for me to be impressed since I see these every few months," he told Forbes.

Arlo James Barnes,
11/5/2013 | 6:21:40 PM
re: Google Captcha Dumps Distorted Text Images
"Although Shet didn't spell out exactly what these puzzles might look like, he did say that unlike humans, bots have a tough time with numbers."
Ironically, perhaps.