Endpoint
10/26/2017
02:30 PM
Connect Directly
LinkedIn
RSS
E-Mail vvv
100%
0%

Why Data Breach Stats Get It Wrong

It's not the size of the stolen data dump that is important. It's the window between the date of the breach and the date of discovery that represents the biggest threat.

Earlier this month, Yahoo announced that it had drastically underestimated the impact of the data breach it reported in 2016. In December 2016, the company reported that, in addition to its previous breach of 500 million accounts, an additional 1 billion accounts had been compromised in a separate breach. Now, it believes that all of its accounts were compromised — affecting over 3 billion users.

How did Yahoo get it so wrong? And what do these revised breach numbers mean? To understand this, we need to examine the dismal science behind calculating the impact of data breaches.

In one out of four cases, third parties discover the breach, typically after being affected by it or seeing the data distributed in darknets. In other cases, internal investigations discover anomalous behavior, such as a user accessing a database he shouldn't. Either way, it can be difficult to determine how much data was stolen. When a breach is discovered by third parties, it represents only a sample of exposed data, and the attacker may have accessed additional data.

In breaches found by internal probes, seeing that an attacker accessed one file or database does not mean other resources weren't accessed using methods that would appear differently to a forensic investigator (for example, logging in as one user to access one file, while using a different account to access another). There is a great deal of detective work — and estimation — required to describe the scope of a breach.

The round numbers that Yahoo provided illustrate this perfectly. Few consumer companies have sets of user information stored in neatly defined buckets of 500 million and 1 billion. At first, Yahoo likely found evidence to suggest certain data was accessed and had to extrapolate estimates from there. But there is always an element of estimation, or to put it another way, guesswork. Yahoo's latest statement reads: "the company recently obtained new intelligence and now believes [...] that all Yahoo user accounts were affected by the August 2013 theft." Companies are compelled to make, and revise, educated guesses at each stage to demonstrate control of the situation, and to be as transparent and responsible as possible. So it's not surprising that these figures often grow.

The Lag on Lag Time
Another challenge that affects a company's ability to get these figures right is lag time. The breaches reported by Yahoo happened years earlier. There was also a significant lag between breach detection and public notice with Equifax — nearly six weeks. Sophisticated attackers are not only adept at finding system vulnerabilities, but they typically have plenty of time to access data carefully and hide their tracks. Add in several years of buffer time for a company's own available analytics to atrophy for useful forensic purposes (e.g., regularly cleared log files) and detecting unauthorized access and estimating damage becomes even more difficult.

In many ways, lag time is the most critical problem. The days, weeks, and years that pass before breaches are discovered (if they are discovered at all), gives attackers all the time they need to extract full value from the data they have stolen. In the case of stolen usernames and passwords, they are used in credential stuffing attacks to compromise millions of accounts at banks, airlines, government agencies, and other high-profile companies. Businesses are starting to follow recent NIST guidelines that recommend searching darknets for "spilled" credentials. But, by then, the original attacker has already used the credentials to break into accounts. At that point, the credentials are commoditized and are becoming worthless.

So, how should we think differently about data breach statistics? First, we need to remember that huge breaches may have already occurred but remain undiscovered. By definition, we can discuss only the breaches we know about. The more sophisticated the attacker, the greater the likelihood that it will take time to detect their breach. The second thing to remember is that a data breach is like a natural disaster, in that it has follow-on effects throughout the Internet ecosystem and our economy, enabling credential stuffing, account takeover, identity theft, and other forms of fraud. The indirect impact of data breaches is harder to quantify than the scope of the original breaches, and may outstrip the original breach in total harm by orders of magnitude.

The larger a data breach is suspected to be, the more attention it receives. But the scope of the problem is vast and hard to quantify; the projected numbers are just the tip of the iceberg in representing the risk consumers and business users face. It's not the size of the stolen data dump that we need to focus on. It's the window between the date of the breach and the date of discovery that represents the real danger zone. This is when cybercriminals are doing the most harm, using stolen data to break into more accounts, steal more data and identities, and transfer funds. The smart move for every corporate user or consumer is to create strong passwords, never reuse them across sites, monitor financial accounts, and be cautious with all data and shared online services.

Related Content:

Join Dark Reading LIVE for two days of practical cyber defense discussions. Learn from the industry’s most knowledgeable IT security experts. Check out the INsecurity agenda here.

 

Shuman Ghosemajumder is chief technology officer at Shape Security, a security company located in Mountain View, California. As one of the largest processors of login traffic in the world, Shape Security prevents fraud resulting from credential stuffing attacks, when breached ... View Full Bio
Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
5 Reasons the Cybersecurity Labor Shortfall Won't End Soon
Steve Morgan, Founder & CEO, Cybersecurity Ventures,  12/11/2017
Register for Dark Reading Newsletters
White Papers
Video
Cartoon Contest
Write a Caption, Win a Starbucks Card! Click Here
Latest Comment: Gee, these virtual reality goggles work great!!! 
Current Issue
The Year in Security: 2017
A look at the biggest news stories (so far) of 2017 that shaped the cybersecurity landscape -- from Russian hacking, ransomware's coming-out party, and voting machine vulnerabilities to the massive data breach of credit-monitoring firm Equifax.
Flash Poll
[Strategic Security Report] How Enterprises Are Attacking the IT Security Problem
[Strategic Security Report] How Enterprises Are Attacking the IT Security Problem
Enterprises are spending more of their IT budgets on cybersecurity technology. How do your organization's security plans and strategies compare to what others are doing? Here's an in-depth look.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2017-0290
Published: 2017-05-09
NScript in mpengine in Microsoft Malware Protection Engine with Engine Version before 1.1.13704.0, as used in Windows Defender and other products, allows remote attackers to execute arbitrary code or cause a denial of service (type confusion and application crash) via crafted JavaScript code within ...

CVE-2016-10369
Published: 2017-05-08
unixsocket.c in lxterminal through 0.3.0 insecurely uses /tmp for a socket file, allowing a local user to cause a denial of service (preventing terminal launch), or possibly have other impact (bypassing terminal access control).

CVE-2016-8202
Published: 2017-05-08
A privilege escalation vulnerability in Brocade Fibre Channel SAN products running Brocade Fabric OS (FOS) releases earlier than v7.4.1d and v8.0.1b could allow an authenticated attacker to elevate the privileges of user accounts accessing the system via command line interface. With affected version...

CVE-2016-8209
Published: 2017-05-08
Improper checks for unusual or exceptional conditions in Brocade NetIron 05.8.00 and later releases up to and including 06.1.00, when the Management Module is continuously scanned on port 22, may allow attackers to cause a denial of service (crash and reload) of the management module.

CVE-2017-0890
Published: 2017-05-08
Nextcloud Server before 11.0.3 is vulnerable to an inadequate escaping leading to a XSS vulnerability in the search module. To be exploitable a user has to write or paste malicious content into the search dialogue.