Fallout From Faulty Friday CrowdStrike Update Persists
Historic IT outage expected to spur regulatory scrutiny, soul-searching over "monoculture" of IT infrastructure — and cyberattack threats.
July 22, 2024
Echoes of the July 19 CrowdStrike glitch are likely to reverberate across the industry for years to come. For now, IT teams remain focused on slogging through a labor-intensive recovery.
But recovery is just the beginning. What's sure to follow is a barrage of regulatory oversight, hard feelings among the IT community, and a tough reminder that even a small slip-up in a software update can have catastrophic global consequences.
Cyber adversaries have also started to circle, eyeing an opportunity.
Windows in Recovery Mode
The faulty sensor configuration update to the Falcon platform was released on July 19 at 04:09 UTC, according to CrowdStrike. Once pushed out, the update triggered widespread crashes of Microsoft Windows machines across CrowdStrike's 29,000 customers, which rely on the company's software for endpoint detection and response (EDR). CrowdStrike's customers include retailers Target and Amazon, tech giants Alphabet and Intel, and many other household names. When they tried to log on Friday morning, employees at some of the world's largest organizations were left staring at the dreaded blue screen of death. Few sectors were spared: airports, banks, hospitals, and governments were all hit, paralyzing the world's economy and causing panic.
It wasn't a cyberattack, CrowdStrike assured the world, just a glitch. But that was little comfort to IT teams who faced Friday with the task of manually booting affected PCs into recovery mode, deleting the bad file, and restarting. That process is still underway in many organizations.
"This is not something that can be done remotely, and in many organizations, will require an administrator," said Tom Marsland, vice president of technology for Cloud Range, in a statement. "This means someone from IT support going computer to computer and doing this manually."
Marsland predicted the recovery will take days, even a week or more, for some larger companies.
"Recovery is going to be painful, to put it lightly," Marsland added.
The Windows crashes were unrelated to a July 18 Azure outage, which has since been remediated, according to a Microsoft spokesperson.
According to Microsoft, which says it has been working closely with CrowdStrike on remediating the issue, some 8.5 million Windows devices — less than 1% of all Windows machines — were affected by the flawed update.
"This incident demonstrates the interconnected nature of our broad ecosystem — global cloud providers, software platforms, security vendors and other software vendors, and customers. It's also a reminder of how important it is for all of us across the tech ecosystem to prioritize operating with safe deployment and disaster recovery using the mechanisms that exist," said David Weston, vice president of enterprise and OS security at Microsoft in a post over the weekend.
CrowdStrike Glitched
So how did a CrowdStrike update crash the world's computers? It's what they didn't do that was problematic, experts say.
David Brumley, a professor in the Electrical and Computer Engineering Department at Carnegie Mellon University, sees two mistakes on CrowdStrike's part: one in testing and one in the rollout.
"First, they didn't stress-test their updates enough," Brumley said in a statement provided to Dark Reading. "This needs to be done at two stages: stress-testing software components before they are assembled, and stress-testing the final software builds across operating system versions."
The missteps continued, according to Brumley.
"Second, they were not incremental enough in their rollout," he added. "That means everyone got the bad update at once. Companies like Google will roll out updates incrementally so if the update is bad, at least it will have limited damage."
There's also the matter of rolling out the update on a Friday — a practice widely considered among IT professionals to be poor form.
"Deploying updates on a Friday is generally a bad idea due to several risks, as highlighted by the CrowdStrike incident," says Callie Guenther, senior manager, cyber threat research, at Critical Start. "Typically, IT teams are understaffed over the weekend, so if an update goes wrong, there are fewer people available to fix it."
She adds that Friday rollouts also increase the odds that a problem will go unnoticed over the weekend.
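Brumley's point about incremental rollouts can be made concrete with a short sketch. The Python below is illustrative only and does not describe how CrowdStrike, Google, or any other vendor actually ships updates. The function name `staged_rollout`, the `apply_update` callback, the stage percentages, and the failure threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class RolloutResult:
    completed: bool                  # True if every stage passed its health check
    hosts_updated: int               # hosts that received the update
    halted_at_stage: Optional[int]   # index of the stage that failed, if any


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], bool],            # hypothetical: True if host is healthy after update
    stage_percents: Sequence[int] = (1, 10, 50, 100),  # assumed cumulative wave sizes
    max_failure_rate: float = 0.02,                 # assumed tolerance per wave
) -> RolloutResult:
    """Push an update in growing waves and stop as soon as a wave looks unhealthy."""
    updated = 0
    for stage, percent in enumerate(stage_percents):
        # Cumulative number of hosts that should be updated by the end of this stage.
        target = max(1, len(hosts) * percent // 100)
        wave = hosts[updated:target]
        if not wave:
            continue
        failures = sum(1 for host in wave if not apply_update(host))
        updated = target
        # Halt before the next, larger wave if too many hosts in this wave failed.
        if failures / len(wave) > max_failure_rate:
            return RolloutResult(False, updated, stage)
    return RolloutResult(True, updated, None)
```

Under these assumptions, an update that breaks every machine it touches would stop after the first wave: with 1,000 hosts, only 10 would receive it before the rollout halts, rather than all 1,000 at once.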
"Huge Deal"
As CrowdStrike claws its way out of this incident, the company is likely to face a whirlwind of scrutiny. The wisdom of rampant consolidation among software vendors is also likely to be examined, Andy Ellis, operating partner at YL Ventures, tells Dark Reading.
"I suspect that every regulator with even a smidgen of authority will be investigating, even if just to explore the vendor consolidation risk across so many different critical industries," Ellis says. "This has exposed how much of a monoculture our core infrastructure relies on."
By Friday afternoon, Federal Trade Commission Chair Lina Khan appeared to reference the CrowdStrike outage on social media, noting that reliance on too few vendors has created "fragile systems," where a "... single glitch results in a system-wide outage."
Beyond lost profits and hours of work needed in the aftermath of the CrowdStrike outage, adversaries are already trying to capitalize. Both CrowdStrike's CEO George Kurtz and CISA warned that scammers are looking to take advantage of the chaos.
"We know that adversaries and bad actors will try to exploit events like this," Kurtz said in a statement. "I encourage everyone to remain vigilant and ensure that you're engaging with official CrowdStrike representatives."
As for CrowdStrike, the company will need to convince customers this is a one-off bungle. Contracts will likely protect CrowdStrike from any legal liability, Ellis explains, adding that ultimately it will be up to its customers to decide the company's fate.
"I suspect, like most software companies, that contractual limitations on liability will directly protect CrowdStrike, but that doesn't protect them from hard conversations with regulators, or with customers during their renewal cycles," Ellis adds. "This is a huge deal."