As organizations worldwide scramble to restore their Windows XP SP3 machines from crashes or repeated reboots due to a faulty virus definition update issued by McAfee yesterday, some security experts worry that additional machines could be affected weeks or months from now.
McAfee has apologized publicly for pushing the defective 5958 virus definition file, which caused some Windows XP Service Pack 3 systems to crash or continuously reboot; the company says less than 1 percent of its enterprise customers were affected. The faulty update, which passed McAfee's quality assurance testing process, generated a "false positive," the company says, incorrectly detecting and quarantining XP SP3's svchost.exe as a virus.
According to a FAQ issued to McAfee corporate customers today, the company did not include XP SP3 with VSE 8.7 in its testing, resulting in "inadequate coverage of Product and Operating System combinations in the test systems used." The faulty AV update was removed from McAfee's download servers, and a new version has been released.
"We are not aware of significant impact on consumers," McAfee said in a statement. "We are investigating how the incorrect detection made it into our DAT files and will take measures to prevent this from reoccurring."
But there are still plenty of unanswered questions about the error -- what exactly went wrong in McAfee's quality assurance testing process, why McAfee wasn't testing sufficiently for the pervasive XP SP3 configuration, and what happens to XP SP3 machines that haven't yet been affected by the bad update, but could be later.
"It could have been anything from sabotage to just carelessness," says security expert Lucas Lundgren. "What scares me a little is haven't they tried this in a test environment before launching? And if they did, they have a serious problem on how they test their products."
Amrit Williams, CTO at BigFix and the former director of engineering at McAfee who helped develop the AV company's DAT testing process, says the incident is a major failure of McAfee's internal quality control process. "It's completely unacceptable," Williams says. "The fact that this got through indicates it was either malicious or negligent."
Organizations that don't apply the replacement DAT file McAfee issued could end up suffering crashes and repeated reboots: "Those customers should exclude svchost.exe from being scanned until they can apply the appropriate McAfee DAT file, which is now available," Williams says.
Peter Schlampp, vice president of marketing and product management for Solera Systems, says his firm has spoken to companies that are worried the bad DAT file could act as a time bomb. "They are concerned that machines that they don't know about will get the DAT file ... They might not immediately exhibit the behavior" caused by the file, Schlampp says. That's because not all DLLs are loaded all the time, and not all host processes trigger the crash or constant reboot issue, he says.
"They are concerned it's going to manifest over time, and weeks or months from now they will find machines behaving like this," he says.
The bad AV update hit many companies hard: Lundgren says Sweden's largest local telephone company reported 17,000 of its machines knocked out of commission by McAfee's update, as well as 10,000 machines in municipal jurisdictions. "There's still a void [with] corporations that have not yet reported the issue," says Lundgren, who recently blogged about the update.
Security experts compare the DAT debacle to a virus outbreak. "The impact on some organizations is far worse than any virus clean-up," Williams says. "It's more than a false positive -- it's creating a massive denial-of-service for XP SP3."
Victim organizations have to physically work on each affected machine, either booting it into safe mode or reimaging the device altogether, he says.
McAfee's viral virus update isn't the first such incident, but experts say it may be one of the worst. A bad DAT file issued five years ago by one major AV vendor deleted email for hours, Solera's Schlampp says. "When you're relying on signature-based security, there's a battle against time to deliver signatures that fight the latest threats ASAP," he says. But vendors have to be sure these fixes do no harm, he says.
McAfee's mistake may have been one of the biggest ones to date, experts say. "This one happened to a critical Windows system file," Williams notes.
While false positives occur regularly for nonvital processes, they're a real problem when they hit vital processes such as this one, Lundgren says. "I wouldn't get mad if McAfee thought that my DVD player software was a virus. But when it's a false positive on a vital process, it's really bad," he says.
Meanwhile, McAfee noted that customers who had the VirusScan Enterprise software's "Scan processes" feature disabled were immune from the issue.
But experts say the workaround McAfee provided for the problem doesn't work on computers that are already stuck in the reboot loop or that can't start critical Windows system files such as svchost.exe.
BigFix's Williams says the incident in the long run undermines the integrity of the security industry as well as AV. "It's probably going to push some folks to look at alternative methods for endpoint protection," such as desktop virtualization for running security tools, he says. "It could have some good impact on the industry" to accelerate innovation beyond today's tools, he says.
Kelly Jackson Higgins is the Executive Editor of Dark Reading. She is an award-winning veteran technology and business journalist with more than two decades of experience in reporting and editing for various publications, including Network Computing, Secure Enterprise ...