If you've attended a security conference recently, you have probably heard hundreds of vendors repeating the words "We have the best artificial intelligence (AI) and machine learning." If you happened to be in one of those conversations and asked "What does that mean?" you probably got a blank stare. Many security consumers are frustrated when marketing pitches don't clearly articulate what AI does in a product to help protect an environment better.
There are several dilemmas facing security companies that keep them from being more up-front about how they use AI and machine learning. For some, the concepts are a marketing statement only, and what they call AI and machine learning is actually pattern matching. Also, machine learning relies on a tremendous volume of data to be effective, and there are very few vendors that possess enough of it to be successful in its implementation.
To avoid a wasted investment in this technology, it's essential to understand the basics of what AI and machine learning provide in security tools. My goal is for you to be equipped to ask the right questions when a vendor proclaims "We have the best AI!"
What Is AI and What Does It Do?
There are many definitions of AI; the definition I use is that it's a system that can monitor behavior, learn, and then adapt and problem solve. The problem solving is where the machine learning component of AI comes into effect. An example of an AI use case is when a machine plays chess. Because there are so many permutations of options and movements in chess, it requires an AI system to observe adversary behavior, learn the consequence of these behaviors, and then formulate actions that result in a strategy to defeat a human opponent.
In contrast, a good example of pattern matching is playing checkers. Any simple computer program can run the limited number of moves and counter-moves an adversary would use in checkers based on the pattern of movement that the human establishes.
So, the first question to ask a vendor is: "Is your AI doing pattern matching or problem solving?" A good indicator that a security tool is just doing pattern matching is if a vendor tells you that it works right out of the box and doesn't need to learn the environment. Pattern matching isn't necessarily bad, but it isn't adaptable to ever-changing threats. It requires constant vendor updates just to keep pace with threats, and it never gets ahead of them.
Another question to ask vendors is: "What type of machine learning does your system use?" Decision tree learning is common in what I'd refer to as pattern-matching systems. Basically, decision tree learning consists of mapping observations to predefined conclusions or actions. For example, "if I see this C2 domain in packet capture and detect this hash value in network intrusion monitoring, then I likely have this particular threat actor in my environment." This approach is great for lowering the amount of human touch or repetitive tasks your security team must do on a daily basis.
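The C2-domain-and-hash example above can be sketched as a small hand-written decision tree. This is only an illustration of mapping observations to predefined conclusions: the indicator values, the `Observation` type, and the `classify` function are all hypothetical, and a real product would learn or import its branches rather than hard-code them.

```python
from dataclasses import dataclass

# Hypothetical indicators; real values would come from threat intelligence feeds.
KNOWN_C2_DOMAINS = {"bad-c2.example"}
KNOWN_MALWARE_HASHES = {"0123456789abcdef0123456789abcdef"}


@dataclass
class Observation:
    dns_domains: set   # domains seen in packet capture
    file_hashes: set   # hashes seen by network intrusion monitoring


def classify(obs: Observation) -> str:
    """Walk a fixed decision tree and return a predefined conclusion."""
    saw_c2 = bool(obs.dns_domains & KNOWN_C2_DOMAINS)
    saw_hash = bool(obs.file_hashes & KNOWN_MALWARE_HASHES)

    if saw_c2:
        if saw_hash:
            return "likely known threat actor"
        return "suspicious C2 traffic"
    if saw_hash:
        return "known malware file"
    return "no known pattern matched"
```

Note that every conclusion here is written in advance. An observation that matches none of the branches falls through to "no known pattern matched," which is why this style of tool depends on regular indicator updates.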
Another common machine learning strategy in security tools is a Bayesian model. This is where you transition from algebra with a decision tree to calculus with a Bayesian approach and, in my opinion, move from pattern matching to true AI. Essentially, the Bayesian model observes the state of many variables across your environment and maps them, based on whether they're true or false, into a data table that allows the AI system to determine the probability or confidence level that a particular event has happened. This approach doesn't draw predefined conclusions from these observations; instead, based on the observed status of many variables, it tells you when there's a high probability of malicious activity. The more data points you can process through a Bayesian model, the more accurate it becomes.
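To make the "data table of true/false variables" idea concrete, here is a minimal sketch using naive Bayes, one simple kind of Bayesian model. It assumes the signals are independent of one another, which real telemetry rarely is. The signal names and every probability in the table are invented for illustration; a real system would estimate them from its own data.

```python
# Hypothetical prior: assumed fraction of events that are malicious.
PRIOR_MALICIOUS = 0.01

# Hypothetical table: signal -> (P(signal true | malicious), P(signal true | benign)).
LIKELIHOODS = {
    "off_hours_login": (0.60, 0.05),
    "new_external_destination": (0.70, 0.10),
    "large_outbound_transfer": (0.50, 0.02),
}


def posterior_malicious(observed: dict) -> float:
    """Return P(malicious | observed signals) using Bayes' rule.

    `observed` maps every signal name in LIKELIHOODS to True or False.
    """
    p_mal = PRIOR_MALICIOUS
    p_ben = 1.0 - PRIOR_MALICIOUS
    for signal, (p_true_mal, p_true_ben) in LIKELIHOODS.items():
        if observed[signal]:
            p_mal *= p_true_mal
            p_ben *= p_true_ben
        else:
            p_mal *= 1.0 - p_true_mal
            p_ben *= 1.0 - p_true_ben
    # Normalize so the two hypotheses sum to 1.
    return p_mal / (p_mal + p_ben)


all_true = {name: True for name in LIKELIHOODS}
all_false = {name: False for name in LIKELIHOODS}
print(posterior_malicious(all_true))
print(posterior_malicious(all_false))
```

The output is a confidence level rather than a named conclusion. No single signal is damning on its own; the probability rises only as several unusual conditions line up.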
My favorite machine learning technique is clustering, specifically k-means clustering. This is a machine learning system that plots a graph of what's expected from the telemetry of your environment in a clustered model. If you have a million dots in your cluster, and you see a dot that's plotted outside of the cluster, that's an anomaly that should be investigated. This learning system doesn't need a pattern; it maps what normal behavior looks like and identifies outliers.
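The "dot outside the cluster" idea can be sketched with a small k-means implementation in plain Python. The two telemetry features (logins per hour and megabytes transferred), the baseline data, and the fixed distance threshold are all assumptions made for this illustration; k-means itself only finds the cluster centers, and the outlier rule on top of it is one simple choice among many.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Return k centroids fitted to `points` (a list of equal-length tuples)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids


def is_outlier(point, centroids, threshold):
    """Flag a point whose distance to every centroid exceeds `threshold`."""
    return min(math.dist(point, c) for c in centroids) > threshold


# Hypothetical baseline: (logins per hour, MB transferred) for two normal workloads.
baseline = [(10 + i % 3, 1 + i % 2) for i in range(30)]
baseline += [(50 + i % 3, 5 + i % 2) for i in range(30)]

centroids = kmeans(baseline, k=2)
print(is_outlier((11, 1), centroids, threshold=5.0))
print(is_outlier((30, 20), centroids, threshold=5.0))
```

Nothing in this sketch describes what an attack looks like. It only learns where normal activity sits, so choosing `k` and the threshold well depends on having enough representative baseline data.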
Bringing It All Together
A good AI system will have elements of a decision tree (for known patterns), Bayesian analysis for anomaly detection, and clustering for baseline monitoring. If the AI system you want to buy can seamlessly integrate these three machine learning techniques, you're probably on the right track.
But you're not done yet. Next, you have to ask yourself: "What data do I need to provide to these tools, and do I collect the data needed to make these tools work the way they're designed?" Billions of dollars are wasted on strategies and tools because the users who buy them get wowed by a sales pitch and can't implement the tools when they get them.
Here's my advice for selecting an AI solution to enhance your security program:
1. Don't fall for the marketing pitch, and ask the critical questions.
2. Make sure you understand the machine learning strategies leveraged by the vendor and that they make sense for the data you have in your environment.
3. Decide if you are playing checkers or chess with your adversary.
Understanding these points will help you make an informed decision that avoids wasted investments, time, and resources. While AI and machine learning create efficiencies that can't be duplicated by humans, the human element still must be in place to make sense of the information and process it properly so it can be used to achieve organizational objectives.