Machine learning has a perception problem. I recently met with a public company CEO who told me that "machine learning" has become an overused buzzword just like "big data" was a few years ago. Only it's even worse with machine learning because no one really understands what it means.
In the most common misperception, machine learning is a magic box of algorithms that you let loose on your data and that then starts producing nuggets of brilliant insight. Apply this misperception to cybersecurity, and you might think that once machine learning is deployed, your security experts will be out of a job because algorithms will be doing all of their important threat detection and prevention work.
[Read why Simon Crosby thinks Machine Learning Is Cybersecurity's Latest Pipe Dream.]
In Simon's commentary, he argues (three times, even) that experts are a better choice than ML/AI (Machine Learning/Artificial Intelligence) for cybersecurity. But why choose between experts and machine learning at all? A more enlightened understanding of machine learning in cybersecurity sees it as an arsenal of "algorithmic assistants" to help the security expert automate the analysis of data by looking for helpful anomalies and patterns -- but under the direction of the security experts.
Here's an example: A security expert doing malware research reads an article that contains an analysis of a version of the infamous Framework POS malware that exfiltrates data over the DNS protocol. Knowing what kind of security infrastructure is already in place, she thinks, "Hmm, if that exfiltration was done slowly enough on our network, I'm not sure we'd be able to detect it." Thinking a bit more, she adds, "Wow, I can really see how it could take some organizations months to detect a data breach that uses this method!"
She then configures her machine learning software to continually analyze DNS requests coming from all clients (POS and workstations) on her network. She instructs the algorithms to create baselines of normal DNS request activity for each client and to perform a population analysis across all clients, in case some machines are already exfiltrating data when the analysis starts. The machine learning engine starts this analysis and alerts her any time it detects unusual behavior indicative of DNS "tunneling."
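To make the population analysis concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the client names, the hourly request counts, and the 3.5 threshold are all hypothetical, and request volume is only one possible feature (a real detector might also look at query length or subdomain randomness). The sketch compares each client to its peers using the median and the median absolute deviation (MAD), which stay stable even if one machine is already misbehaving when the analysis starts.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value against the whole population using median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Every client looks the same; nothing stands out.
        return [0.0 for _ in values]
    # 0.6745 rescales MAD so scores are comparable to ordinary z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_population_outliers(counts_by_client, threshold=3.5):
    """Return the clients whose request volume is far above their peers'."""
    clients = list(counts_by_client)
    scores = robust_z_scores([counts_by_client[c] for c in clients])
    return {c: round(s, 2) for c, s in zip(clients, scores) if s > threshold}


# Hypothetical DNS request counts per client for one hour.
dns_counts = {"pos-01": 40, "pos-02": 42, "pos-03": 38, "ws-17": 45, "pos-04": 410}
suspects = flag_population_outliers(dns_counts)
print(suspects)
```

In this made-up data, only pos-04 is flagged. Because the comparison is against the other clients rather than against a machine's own history, it does not need a clean learning period for that machine.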
In this way, our security expert has just put one "algorithmic assistant" to work for her. It never sleeps, eats, or takes vacation, and it does exactly what she told it to do! Tomorrow, she thinks, "I'll figure out a way to put another algorithmic assistant to work looking for unusual SSH sessions, another issue I've been losing sleep over."
Machine Learning Algorithmic Assistants Have Several Skills
Almost all algorithmic assistants that utilize unsupervised machine learning have several skill sets based on modern data science. They can baseline normal behavior by accurately modeling time series data (any series of data with a timestamp on it – usually log data from servers, devices, endpoints, and applications); they can identify data points that are anomalous, or "outliers"; and they can score the level of anomalousness of these outliers. Generally, you'll hear this set of skills packaged up under the term "machine learning anomaly detection."
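Those three skills (baseline, identify, score) can be sketched in a few lines of Python. This is a deliberately simple stand-in, assuming a rolling mean and standard deviation as the "model" of normal behavior; production systems typically use far richer time series models. The function name, window size, `min_sigma` floor, alert threshold, and sample data are all hypothetical.

```python
from statistics import mean, pstdev


def anomaly_scores(series, window=5, min_sigma=1.0):
    """Score each point by its distance from a rolling baseline.

    The baseline is the mean and standard deviation of the preceding
    `window` points. The score is the distance from that mean, measured
    in baseline standard deviations. The first `window` points have no
    baseline yet and get None.
    """
    scores = []
    for i, value in enumerate(series):
        if i < window:
            scores.append(None)
            continue
        baseline = series[i - window:i]
        mu = mean(baseline)
        # Floor the deviation so a perfectly flat baseline cannot divide by zero.
        sigma = max(pstdev(baseline), min_sigma)
        scores.append(abs(value - mu) / sigma)
    return scores


# Hypothetical requests-per-minute values taken from a log.
requests_per_minute = [20, 22, 19, 21, 20, 21, 95, 22]
scores = anomaly_scores(requests_per_minute)
outliers = [i for i, s in enumerate(scores) if s is not None and s > 3.0]
print(outliers)
```

Here the spike to 95 at index 6 is the only point flagged. Note one weakness of this naive approach: once the spike enters the rolling window, it inflates the baseline for the points that follow, which is one reason real anomaly detection engines model the data more carefully.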
More recent developments in machine learning-based security analytics have additional capabilities; think of these as "senior algorithmic assistants" that can take the work of their subordinate assistants and perform advanced functions such as influencer analysis, correlation, causation, and even forecasting, to provide even more context for the security experts.
Perception Problem: Maybe. Pipe Dream: No!
Here's an interesting data point: In an April 2015 survey performed by Enterprise Management Associates, for the second year in a row security analytics (Advanced Security/Threat Analytics & Anomaly Detection) scored in the top ranking for perceived value when compared to total cost of ownership (TCO), beating out 15 other security technologies.
For forward-thinking security pros, this kind of security analytics, powered by machine learning, is no pipe dream – and it's so much more than just marketing spin. It's a practical way to use newer technology to automate the analysis of log data to better detect cyberthreat activity, under the direction and guidance of an organization's security experts.