Machine learning continues to be widely pursued by cybersecurity companies as a way to bolster defenses and speed response.
Machine learning, for example, has helped companies such as security firm Malwarebytes improve their ability to detect attacks on consumer systems. In the first five months of 2019, about 5% of the 94 million malicious programs detected by Malwarebytes' endpoint protection software came from its machine-learning powered anomaly-detection system, according to the company.
Such systems, and artificial intelligence (AI) technologies, in general, will be a significant component of all companies' cyberdefense, says Adam Kujawa, director of Malwarebytes' research labs.
"The future of AI for defenses goes beyond just detecting malware, but also will be used for things like finding network intrusions or just noticing that something weird is going on in your network," he says. "The reality is that good AI will not only identify that it's weird, but [it] also will let you know how it fits into the bigger scheme."
Yet, while Malwarebytes joins other cybersecurity firms as a proponent of machine learning and the promise of AI as a defensive measure, the company also warns that automated and intelligent systems can tip the balance in favor of the attacker. Initially, attackers will likely incorporate machine learning into backend systems to create more custom and widespread attacks, but they will eventually focus on ways to attack other AI systems as well.
Malwarebytes is not alone in that assessment, and it's not the first to issue a warning, as it did in a report released on June 19. From adversarial attacks on machine-learning systems to deep fakes, a range of techniques that general fall under the AI moniker are worrying security experts.
In 2018, IBM created a proof-of-concept attack, DeepLocker, that conceals itself and its intentions until it reaches a specific target, raising the possibility of malware that infects millions of systems without taking any action until it triggers on a set of conditions.
"The shift to machine learning and AI is the next major progression in IT," Marc Ph. Stoecklin, principal researcher and manager for cognitive cybersecurity intelligence at IBM, wrote in a post last year. "However, cybercriminals are also studying AI to use it to their advantage — and weaponize it."
The first problem for both attackers and defenders is creating stable AI technology. Machine-learning algorithms require good data to train into reliable systems, and researchers and bad actors have found ways to pollute the data sets as a way to corrupt the resultant system.
In 2016, for example, Microsoft launched a chatbot, Tay, on Twitter that could learn from messages and tweets, saying, "the more you talk the smarter Tay gets." Within 24 hours of going online, a coordinated effort by some users resulted in Tay responding to tweets with racist responses.
The incident "shows how you can train — or mistrain — AI to work in effective ways," Kujawa says.
Polluting the dataset collected by cybersecurity firms could similarly create unexpected behavior and make them perform poorly.
A number of AI researchers have already used such attacks to undermine machine-learning algorithms. A group including researchers from Pennsylvania State University, Google, the University of Wisconsin, and the US Army Research Lab used its own AI attacker to craft images that could be fed to other machine-learning systems to train the targeted systems to incorrectly identify images.
"Adversarial examples thus enable adversaries to manipulate system behaviors," the researchers stated in the paper. "Potential attacks include attempts to control the behavior of vehicles, have spam content identified as legitimate content, or have malware identified as legitimate software."
While Malwarebytes' Kujawa cannot point to a current instance of malware in the wild that used machine-learning or AI techniques, he expects to see examples soon. Rather than malware that incorporates neural networks or other AI technology, initial attempts at fusing malware with AI will likely focus on the backend: the command-and-control server, he says.
"I think we are going to see a bot that is deployed on an endpoint somewhere, communicating with the command-and-control server, [which] has the AI, has the technology that is being used to identify targets, what's going on, gives commands, and basically acts as an operator," Kujawa says.
Companies should expect attacks to become more targeted in the future as attackers increasingly use AI techniques. Similar to the way that advertisers track potential interested users, attackers will track the population to better target their intrusions and malware, he says.
"These things could create their own victim profiles internally," he says. "A dossier on each target can be created by an AI very quickly."